Dataset statistics
| Number of variables | 29 |
|---|---|
| Number of observations | 1000000 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 221.3 MiB |
| Average record size in memory | 232.0 B |
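The average record size looks like the total in-memory size divided by the number of observations. A minimal check of that arithmetic, assuming 1 MiB means 1,048,576 bytes (the variable names are illustrative, not taken from the profiler):

```python
# Figures copied from the Dataset statistics table.
total_size_mib = 221.3
n_observations = 1_000_000

BYTES_PER_MIB = 1024 ** 2  # assumption: MiB is the binary unit (2**20 bytes)

total_bytes = total_size_mib * BYTES_PER_MIB
avg_record_bytes = total_bytes / n_observations

print(f"{avg_record_bytes:.1f} B per record")  # prints "232.0 B per record"
```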
Variable types
| Categorical | 15 |
|---|---|
| Text | 9 |
| Numeric | 5 |
Alerts
| Alert | Type |
| Claim_Date has constant value "2024-04-24" | Constant |
| Claim_ID has unique values | Unique |
| Phone_Number has unique values | Unique |
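The Constant and Unique flags can be understood as simple distinct-count checks per column. The sketch below is a hypothetical helper (`column_alerts` is not part of ydata-profiling) run on a tiny made-up sample, only to illustrate the idea:

```python
def column_alerts(rows):
    """Return {column: "Constant" | "Unique"} for a list of dict records.

    Hypothetical helper for illustration; not the profiler's own code.
    """
    alerts = {}
    if not rows:
        return alerts
    for column in rows[0]:
        values = [row[column] for row in rows]
        distinct = len(set(values))
        if distinct == 1:
            # Every row holds the same value.
            alerts[column] = "Constant"
        elif distinct == len(values):
            # No value is repeated.
            alerts[column] = "Unique"
    return alerts


# Made-up sample that mimics the shape of three profiled columns.
sample = [
    {"Claim_ID": "CLAIM_1", "Claim_Date": "2024-04-24", "Patient_Gender": "Female"},
    {"Claim_ID": "CLAIM_2", "Claim_Date": "2024-04-24", "Patient_Gender": "Female"},
    {"Claim_ID": "CLAIM_3", "Claim_Date": "2024-04-24", "Patient_Gender": "Male"},
]
print(column_alerts(sample))
```

On the sample, `Claim_ID` is flagged Unique and `Claim_Date` Constant, while `Patient_Gender` gets no flag.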
Reproduction
| Analysis started | 2024-05-30 06:19:19.927054 |
|---|---|
| Analysis finished | 2024-05-30 06:25:55.232539 |
| Duration | 6 minutes and 35.31 seconds |
| Software version | ydata-profiling v4.8.3 |
| Download configuration | config.json |
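The reported duration can be cross-checked from the two timestamps. A small sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2024-05-30 06:19:19.927054")
finished = datetime.fromisoformat("2024-05-30 06:25:55.232539")

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```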
Provider_ID
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| Eastern Hospital | 200827 |
|---|---|
| Sky Hospital | 199948 |
| Moon Healthcare | 199909 |
| Asian Medical Center | 199835 |
| Sun Clinic | 199481 |
Length
| Max length | 20 |
|---|---|
| Median length | 15 |
| Mean length | 14.602753 |
| Min length | 10 |
Characters and Unicode
| Total characters | 14602753 |
|---|---|
| Distinct characters | 23 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Asian Medical Center |
|---|---|
| 2nd row | Sky Hospital |
| 3rd row | Moon Healthcare |
| 4th row | Sky Hospital |
| 5th row | Sun Clinic |
Common Values
| Value | Count | Frequency (%) |
| Eastern Hospital | 200827 | 20.1% |
| Sky Hospital | 199948 | 20.0% |
| Moon Healthcare | 199909 | 20.0% |
| Asian Medical Center | 199835 | 20.0% |
| Sun Clinic | 199481 | 19.9% |
Words
| Value | Count | Frequency (%) |
| hospital | 400775 | |
| eastern | 200827 | |
| sky | 199948 | |
| moon | 199909 | |
| healthcare | 199909 | |
| asian | 199835 | |
| medical | 199835 | |
| center | 199835 | |
| sun | 199481 | |
| clinic | 199481 |
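The word counts above are consistent with lowercasing each provider name, splitting it on whitespace, and weighting each word by the row count of its value. A minimal sketch under that assumption (the tokenization rule is inferred from the numbers, not confirmed by the report):

```python
from collections import Counter

# Row counts per provider name, copied from the Common Values table.
provider_counts = {
    "Eastern Hospital": 200827,
    "Sky Hospital": 199948,
    "Moon Healthcare": 199909,
    "Asian Medical Center": 199835,
    "Sun Clinic": 199481,
}

word_totals = Counter()
for name, count in provider_counts.items():
    # Assumed tokenization: lowercase, split on whitespace.
    for word in name.lower().split():
        word_totals[word] += count

for word, n in word_totals.most_common():
    print(word, n)
```

For example, "hospital" appears in two provider names, so its total is 200827 + 199948 = 400775.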
Most occurring characters
| Value | Count | Frequency (%) |
| a | 1401090 | 9.6% |
| e | 1200150 | 8.2% |
| (space) | 1199835 | 8.2% |
| i | 1199407 | 8.2% |
| n | 1199368 | 8.2% |
| t | 1001346 | 6.9% |
| l | 1000000 | 6.8% |
| s | 801437 | 5.5% |
| o | 800593 | 5.5% |
| H | 600684 | 4.1% |
| Other values (13) | 4198843 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 11203083 | 76.7% |
| Uppercase Letter | 2199835 | 15.1% |
| Space Separator | 1199835 | 8.2% |
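The category labels correspond to Unicode general categories (`Ll`, `Lu`, `Zs`). As a sketch, the totals can be rebuilt from the Common Values counts with the standard `unicodedata` module. The `LABELS` mapping is an assumption about how the report names each category:

```python
import unicodedata
from collections import Counter

# Row counts per provider name, copied from the Common Values table.
provider_counts = {
    "Eastern Hospital": 200827,
    "Sky Hospital": 199948,
    "Moon Healthcare": 199909,
    "Asian Medical Center": 199835,
    "Sun Clinic": 199481,
}

# Assumed mapping from two-letter category codes to the report's labels.
LABELS = {"Ll": "Lowercase Letter", "Lu": "Uppercase Letter", "Zs": "Space Separator"}

category_totals = Counter()
for name, count in provider_counts.items():
    for char in name:
        code = unicodedata.category(char)
        category_totals[LABELS.get(code, code)] += count

total_chars = sum(category_totals.values())
for label, n in category_totals.most_common():
    print(f"{label}: {n} ({n / total_chars:.1%})")
```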
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 1401090 | |
| e | 1200150 | |
| i | 1199407 | |
| n | 1199368 | |
| t | 1001346 | |
| l | 1000000 | |
| s | 801437 | |
| o | 800593 | |
| r | 600571 | |
| c | 599225 | |
| Other values (6) | 1399896 |
Uppercase Letter
| Value | Count | Frequency (%) |
| H | 600684 | |
| M | 399744 | |
| S | 399429 | |
| C | 399316 | |
| E | 200827 | 9.1% |
| A | 199835 | 9.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1199835 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 13402918 | |
| Common | 1199835 | 8.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 1401090 | 10.5% |
| e | 1200150 | 9.0% |
| i | 1199407 | 8.9% |
| n | 1199368 | 8.9% |
| t | 1001346 | 7.5% |
| l | 1000000 | 7.5% |
| s | 801437 | 6.0% |
| o | 800593 | 6.0% |
| H | 600684 | 4.5% |
| r | 600571 | 4.5% |
| Other values (12) | 3598272 |
Common
| Value | Count | Frequency (%) |
| (space) | 1199835 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 14602753 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 1401090 | 9.6% |
| e | 1200150 | 8.2% |
| (space) | 1199835 | 8.2% |
| i | 1199407 | 8.2% |
| n | 1199368 | 8.2% |
| t | 1001346 | 6.9% |
| l | 1000000 | 6.8% |
| s | 801437 | 5.5% |
| o | 800593 | 5.5% |
| H | 600684 | 4.1% |
| Other values (13) | 4198843 |
Claim_ID
Text
UNIQUE 
| Distinct | 1000000 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
Length
| Max length | 13 |
|---|---|
| Median length | 12 |
| Mean length | 11.888896 |
| Min length | 7 |
Characters and Unicode
| Total characters | 11888896 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1000000 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | CLAIM_1 |
|---|---|
| 2nd row | CLAIM_2 |
| 3rd row | CLAIM_3 |
| 4th row | CLAIM_4 |
| 5th row | CLAIM_5 |
Words
| Value | Count | Frequency (%) |
| claim_1 | 1 | < 0.1% |
| claim_32 | 1 | < 0.1% |
| claim_30 | 1 | < 0.1% |
| claim_15 | 1 | < 0.1% |
| claim_3 | 1 | < 0.1% |
| claim_4 | 1 | < 0.1% |
| claim_5 | 1 | < 0.1% |
| claim_6 | 1 | < 0.1% |
| claim_7 | 1 | < 0.1% |
| claim_8 | 1 | < 0.1% |
| Other values (999990) | 999990 |
Most occurring characters
| Value | Count | Frequency (%) |
| C | 1000000 | 8.4% |
| L | 1000000 | 8.4% |
| A | 1000000 | 8.4% |
| I | 1000000 | 8.4% |
| M | 1000000 | 8.4% |
| _ | 1000000 | 8.4% |
| 1 | 600001 | 5.0% |
| 6 | 600000 | 5.0% |
| 5 | 600000 | 5.0% |
| 8 | 600000 | 5.0% |
| Other values (6) | 3488895 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 5888896 | 49.5% |
| Uppercase Letter | 5000000 | 42.1% |
| Connector Punctuation | 1000000 | 8.4% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 600001 | |
| 6 | 600000 | |
| 5 | 600000 | |
| 8 | 600000 | |
| 2 | 600000 | |
| 3 | 600000 | |
| 4 | 600000 | |
| 7 | 600000 | |
| 9 | 600000 | |
| 0 | 488895 |
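These digit counts are what one would expect if the IDs run sequentially from CLAIM_1 to CLAIM_1000000. That is an assumption suggested by the sample rows, not something the report states. A quick sketch that counts the digits of 1 through 1,000,000:

```python
from collections import Counter

N_CLAIMS = 1_000_000  # number of observations reported for the dataset

# Assumption: the numeric suffixes are exactly 1..N_CLAIMS with no gaps.
digit_counts = Counter("".join(map(str, range(1, N_CLAIMS + 1))))

print(digit_counts["1"], digit_counts["0"], sum(digit_counts.values()))
# prints "600001 488895 5888896"
```

Under this assumption the digit 1 occurs 600001 times, the digit 0 occurs 488895 times, and there are 5888896 digits in total, matching the Decimal Number figures above.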
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 1000000 | |
| L | 1000000 | |
| A | 1000000 | |
| I | 1000000 | |
| M | 1000000 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 1000000 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 6888896 | |
| Latin | 5000000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| _ | 1000000 | |
| 1 | 600001 | |
| 6 | 600000 | |
| 5 | 600000 | |
| 8 | 600000 | |
| 2 | 600000 | |
| 3 | 600000 | |
| 4 | 600000 | |
| 7 | 600000 | |
| 9 | 600000 |
Latin
| Value | Count | Frequency (%) |
| C | 1000000 | |
| L | 1000000 | |
| A | 1000000 | |
| I | 1000000 | |
| M | 1000000 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 11888896 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| C | 1000000 | 8.4% |
| L | 1000000 | 8.4% |
| A | 1000000 | 8.4% |
| I | 1000000 | 8.4% |
| M | 1000000 | 8.4% |
| _ | 1000000 | 8.4% |
| 1 | 600001 | 5.0% |
| 6 | 600000 | 5.0% |
| 5 | 600000 | 5.0% |
| 8 | 600000 | 5.0% |
| Other values (6) | 3488895 |
Patient_ID
Text
| Distinct | 329922 |
|---|---|
| Distinct (%) | 33.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
Length
| Max length | 30 |
|---|---|
| Median length | 28 |
| Mean length | 13.276224 |
| Min length | 5 |
Characters and Unicode
| Total characters | 13276224 |
|---|---|
| Distinct characters | 54 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 177047 |
|---|---|
| Unique (%) | 17.7% |
Sample
| 1st row | Darrell Blair |
|---|---|
| 2nd row | William Young |
| 3rd row | Keith Reynolds |
| 4th row | Andre Kelly |
| 5th row | Terry Gonzales |
Words
| Value | Count | Frequency (%) |
| michael | 22935 | 1.1% |
| smith | 21674 | 1.1% |
| johnson | 17238 | 0.8% |
| james | 16825 | 0.8% |
| david | 15947 | 0.8% |
| jennifer | 14751 | 0.7% |
| john | 14270 | 0.7% |
| williams | 13960 | 0.7% |
| christopher | 13958 | 0.7% |
| thomas | 13686 | 0.7% |
| Other values (1588) | 1879531 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 1234816 | 9.3% |
| a | 1225484 | 9.2% |
| (space) | 1044775 | 7.9% |
| n | 997504 | 7.5% |
| r | 953421 | 7.2% |
| i | 804342 | 6.1% |
| o | 716050 | 5.4% |
| l | 673909 | 5.1% |
| s | 599690 | 4.5% |
| t | 461198 | 3.5% |
| Other values (44) | 4565035 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 10133023 | 76.3% |
| Uppercase Letter | 2077116 | 15.6% |
| Space Separator | 1044775 | 7.9% |
| Other Punctuation | 21310 | 0.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 1234816 | |
| a | 1225484 | |
| n | 997504 | |
| r | 953421 | |
| i | 804342 | 7.9% |
| o | 716050 | 7.1% |
| l | 673909 | 6.7% |
| s | 599690 | 5.9% |
| t | 461198 | 4.6% |
| h | 447234 | 4.4% |
| Other values (16) | 2019375 |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 230824 | 11.1% |
| J | 205645 | 9.9% |
| S | 170291 | 8.2% |
| C | 155605 | 7.5% |
| D | 139935 | 6.7% |
| R | 128561 | 6.2% |
| B | 128232 | 6.2% |
| A | 127158 | 6.1% |
| W | 98467 | 4.7% |
| H | 95791 | 4.6% |
| Other values (16) | 596607 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1044775 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 21310 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 12210139 | |
| Common | 1066085 | 8.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 1234816 | 10.1% |
| a | 1225484 | 10.0% |
| n | 997504 | 8.2% |
| r | 953421 | 7.8% |
| i | 804342 | 6.6% |
| o | 716050 | 5.9% |
| l | 673909 | 5.5% |
| s | 599690 | 4.9% |
| t | 461198 | 3.8% |
| h | 447234 | 3.7% |
| Other values (42) | 4096491 |
Common
| Value | Count | Frequency (%) |
| (space) | 1044775 | 98.0% |
| . | 21310 | 2.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 13276224 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 1234816 | 9.3% |
| a | 1225484 | 9.2% |
| (space) | 1044775 | 7.9% |
| n | 997504 | 7.5% |
| r | 953421 | 7.2% |
| i | 804342 | 6.1% |
| o | 716050 | 5.4% |
| l | 673909 | 5.1% |
| s | 599690 | 4.5% |
| t | 461198 | 3.5% |
| Other values (44) | 4565035 |
Diagnosis_Code
Text
| Distinct | 900 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Characters and Unicode
| Total characters | 6000000 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | DX_714 |
|---|---|
| 2nd row | DX_885 |
| 3rd row | DX_988 |
| 4th row | DX_779 |
| 5th row | DX_644 |
Words
| Value | Count | Frequency (%) |
| dx_715 | 1197 | 0.1% |
| dx_285 | 1195 | 0.1% |
| dx_838 | 1195 | 0.1% |
| dx_514 | 1192 | 0.1% |
| dx_508 | 1190 | 0.1% |
| dx_666 | 1190 | 0.1% |
| dx_421 | 1186 | 0.1% |
| dx_667 | 1185 | 0.1% |
| dx_566 | 1185 | 0.1% |
| dx_848 | 1180 | 0.1% |
| Other values (890) | 988105 |
Most occurring characters
| Value | Count | Frequency (%) |
| D | 1000000 | |
| X | 1000000 | |
| _ | 1000000 | |
| 3 | 311682 | 5.2% |
| 8 | 311624 | 5.2% |
| 4 | 311624 | 5.2% |
| 6 | 311340 | 5.2% |
| 1 | 311331 | 5.2% |
| 5 | 311188 | 5.2% |
| 9 | 310735 | 5.2% |
| Other values (3) | 820476 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 3000000 | 50.0% |
| Uppercase Letter | 2000000 | 33.3% |
| Connector Punctuation | 1000000 | 16.7% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 3 | 311682 | |
| 8 | 311624 | |
| 4 | 311624 | |
| 6 | 311340 | |
| 1 | 311331 | |
| 5 | 311188 | |
| 9 | 310735 | |
| 2 | 310636 | |
| 7 | 310502 | |
| 0 | 199338 |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 1000000 | |
| X | 1000000 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 1000000 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 4000000 | |
| Latin | 2000000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| _ | 1000000 | |
| 3 | 311682 | 7.8% |
| 8 | 311624 | 7.8% |
| 4 | 311624 | 7.8% |
| 6 | 311340 | 7.8% |
| 1 | 311331 | 7.8% |
| 5 | 311188 | 7.8% |
| 9 | 310735 | 7.8% |
| 2 | 310636 | 7.8% |
| 7 | 310502 | 7.8% |
Latin
| Value | Count | Frequency (%) |
| D | 1000000 | |
| X | 1000000 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6000000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| D | 1000000 | |
| X | 1000000 | |
| _ | 1000000 | |
| 3 | 311682 | 5.2% |
| 8 | 311624 | 5.2% |
| 4 | 311624 | 5.2% |
| 6 | 311340 | 5.2% |
| 1 | 311331 | 5.2% |
| 5 | 311188 | 5.2% |
| 9 | 310735 | 5.2% |
| Other values (3) | 820476 |
Procedure_Code
Text
| Distinct | 9000 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 9 |
| Min length | 9 |
Characters and Unicode
| Total characters | 9000000 |
|---|---|
| Distinct characters | 15 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | PROC_2648 |
|---|---|
| 2nd row | PROC_9084 |
| 3rd row | PROC_9747 |
| 4th row | PROC_4334 |
| 5th row | PROC_8408 |
Words
| Value | Count | Frequency (%) |
| proc_5757 | 150 | < 0.1% |
| proc_7096 | 148 | < 0.1% |
| proc_3766 | 148 | < 0.1% |
| proc_3818 | 146 | < 0.1% |
| proc_6126 | 145 | < 0.1% |
| proc_4933 | 145 | < 0.1% |
| proc_6065 | 144 | < 0.1% |
| proc_1628 | 144 | < 0.1% |
| proc_4622 | 144 | < 0.1% |
| proc_7294 | 144 | < 0.1% |
| Other values (8990) | 998542 |
Most occurring characters
| Value | Count | Frequency (%) |
| P | 1000000 | |
| R | 1000000 | |
| O | 1000000 | |
| C | 1000000 | |
| _ | 1000000 | |
| 7 | 412180 | 4.6% |
| 1 | 411808 | 4.6% |
| 9 | 411421 | 4.6% |
| 5 | 411343 | 4.6% |
| 4 | 411120 | 4.6% |
| Other values (5) | 1942128 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 4000000 | 44.4% |
| Decimal Number | 4000000 | 44.4% |
| Connector Punctuation | 1000000 | 11.1% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 7 | 412180 | |
| 1 | 411808 | |
| 9 | 411421 | |
| 5 | 411343 | |
| 4 | 411120 | |
| 3 | 410897 | |
| 6 | 410830 | |
| 2 | 410730 | |
| 8 | 410263 | |
| 0 | 299408 |
Uppercase Letter
| Value | Count | Frequency (%) |
| P | 1000000 | |
| R | 1000000 | |
| O | 1000000 | |
| C | 1000000 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 1000000 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 5000000 | |
| Latin | 4000000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| _ | 1000000 | |
| 7 | 412180 | |
| 1 | 411808 | |
| 9 | 411421 | |
| 5 | 411343 | |
| 4 | 411120 | |
| 3 | 410897 | |
| 6 | 410830 | |
| 2 | 410730 | |
| 8 | 410263 |
Latin
| Value | Count | Frequency (%) |
| P | 1000000 | |
| R | 1000000 | |
| O | 1000000 | |
| C | 1000000 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9000000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| P | 1000000 | |
| R | 1000000 | |
| O | 1000000 | |
| C | 1000000 | |
| _ | 1000000 | |
| 7 | 412180 | 4.6% |
| 1 | 411808 | 4.6% |
| 9 | 411421 | 4.6% |
| 5 | 411343 | 4.6% |
| 4 | 411120 | 4.6% |
| Other values (5) | 1942128 |
Claim_Date
Categorical
CONSTANT 
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| 2024-04-24 | 1000000 |
|---|---|
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 10000000 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2024-04-24 |
|---|---|
| 2nd row | 2024-04-24 |
| 3rd row | 2024-04-24 |
| 4th row | 2024-04-24 |
| 5th row | 2024-04-24 |
Common Values
| Value | Count | Frequency (%) |
| 2024-04-24 | 1000000 | 100.0% |
Words
| Value | Count | Frequency (%) |
| 2024-04-24 | 1000000 |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 3000000 | |
| 4 | 3000000 | |
| 0 | 2000000 | |
| - | 2000000 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 8000000 | 80.0% |
| Dash Punctuation | 2000000 | 20.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 3000000 | |
| 4 | 3000000 | |
| 0 | 2000000 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2000000 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 10000000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 3000000 | |
| 4 | 3000000 | |
| 0 | 2000000 | |
| - | 2000000 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 10000000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 3000000 | |
| 4 | 3000000 | |
| 0 | 2000000 | |
| - | 2000000 |
Admission_Date
Categorical
| Distinct | 30 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| 2024-04-15 | 33573 |
|---|---|
| 2024-04-02 | 33568 |
| 2024-04-04 | 33564 |
| 2024-04-17 | 33534 |
| 2024-03-31 | 33511 |
| Other values (25) | 832250 |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 10000000 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2024-03-26 |
|---|---|
| 2nd row | 2024-04-07 |
| 3rd row | 2024-04-01 |
| 4th row | 2024-03-31 |
| 5th row | 2024-03-27 |
Common Values
| Value | Count | Frequency (%) |
| 2024-04-15 | 33573 | 3.4% |
| 2024-04-02 | 33568 | 3.4% |
| 2024-04-04 | 33564 | 3.4% |
| 2024-04-17 | 33534 | 3.4% |
| 2024-03-31 | 33511 | 3.4% |
| 2024-04-09 | 33498 | 3.3% |
| 2024-04-20 | 33493 | 3.3% |
| 2024-04-01 | 33488 | 3.3% |
| 2024-03-28 | 33473 | 3.3% |
| 2024-03-27 | 33453 | 3.3% |
| Other values (20) | 664845 | 66.5% |
Words
| Value | Count | Frequency (%) |
| 2024-04-15 | 33573 | 3.4% |
| 2024-04-02 | 33568 | 3.4% |
| 2024-04-04 | 33564 | 3.4% |
| 2024-04-17 | 33534 | 3.4% |
| 2024-03-31 | 33511 | 3.4% |
| 2024-04-09 | 33498 | 3.3% |
| 2024-04-20 | 33493 | 3.3% |
| 2024-04-01 | 33488 | 3.3% |
| 2024-03-28 | 33473 | 3.3% |
| 2024-03-27 | 33453 | 3.3% |
| Other values (20) | 664845 | 66.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 2400609 | |
| 2 | 2399990 | |
| - | 2000000 | |
| 4 | 1833573 | |
| 1 | 466098 | 4.7% |
| 3 | 399544 | 4.0% |
| 7 | 100409 | 1.0% |
| 5 | 100215 | 1.0% |
| 8 | 99941 | 1.0% |
| 9 | 99911 | 1.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 8000000 | 80.0% |
| Dash Punctuation | 2000000 | 20.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 2400609 | |
| 2 | 2399990 | |
| 4 | 1833573 | |
| 1 | 466098 | 5.8% |
| 3 | 399544 | 5.0% |
| 7 | 100409 | 1.3% |
| 5 | 100215 | 1.3% |
| 8 | 99941 | 1.2% |
| 9 | 99911 | 1.2% |
| 6 | 99710 | 1.2% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2000000 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 10000000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 2400609 | |
| 2 | 2399990 | |
| - | 2000000 | |
| 4 | 1833573 | |
| 1 | 466098 | 4.7% |
| 3 | 399544 | 4.0% |
| 7 | 100409 | 1.0% |
| 5 | 100215 | 1.0% |
| 8 | 99941 | 1.0% |
| 9 | 99911 | 1.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 10000000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 2400609 | |
| 2 | 2399990 | |
| - | 2000000 | |
| 4 | 1833573 | |
| 1 | 466098 | 4.7% |
| 3 | 399544 | 4.0% |
| 7 | 100409 | 1.0% |
| 5 | 100215 | 1.0% |
| 8 | 99941 | 1.0% |
| 9 | 99911 | 1.0% |
Discharge_Date
Categorical
| Distinct | 30 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| 2024-05-04 | 33631 |
|---|---|
| 2024-05-24 | 33623 |
| 2024-04-27 | 33620 |
| 2024-05-01 | 33515 |
| 2024-05-08 | 33493 |
| Other values (25) | 832118 |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 10000000 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2024-05-08 |
|---|---|
| 2nd row | 2024-05-03 |
| 3rd row | 2024-05-24 |
| 4th row | 2024-04-27 |
| 5th row | 2024-05-12 |
Common Values
| Value | Count | Frequency (%) |
| 2024-05-04 | 33631 | 3.4% |
| 2024-05-24 | 33623 | 3.4% |
| 2024-04-27 | 33620 | 3.4% |
| 2024-05-01 | 33515 | 3.4% |
| 2024-05-08 | 33493 | 3.3% |
| 2024-05-02 | 33430 | 3.3% |
| 2024-05-18 | 33424 | 3.3% |
| 2024-05-22 | 33417 | 3.3% |
| 2024-05-19 | 33409 | 3.3% |
| 2024-05-09 | 33384 | 3.3% |
| Other values (20) | 665054 | 66.5% |
Words
| Value | Count | Frequency (%) |
| 2024-05-04 | 33631 | 3.4% |
| 2024-05-24 | 33623 | 3.4% |
| 2024-04-27 | 33620 | 3.4% |
| 2024-05-01 | 33515 | 3.4% |
| 2024-05-08 | 33493 | 3.3% |
| 2024-05-02 | 33430 | 3.3% |
| 2024-05-18 | 33424 | 3.3% |
| 2024-05-22 | 33417 | 3.3% |
| 2024-05-19 | 33409 | 3.3% |
| 2024-05-09 | 33384 | 3.3% |
| Other values (20) | 665054 | 66.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 2433572 | |
| 0 | 2400020 | |
| - | 2000000 | |
| 4 | 1300618 | |
| 5 | 899670 | 9.0% |
| 1 | 433168 | 4.3% |
| 3 | 132957 | 1.3% |
| 9 | 100117 | 1.0% |
| 8 | 100083 | 1.0% |
| 7 | 100073 | 1.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 8000000 | 80.0% |
| Dash Punctuation | 2000000 | 20.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 2433572 | |
| 0 | 2400020 | |
| 4 | 1300618 | |
| 5 | 899670 | 11.2% |
| 1 | 433168 | 5.4% |
| 3 | 132957 | 1.7% |
| 9 | 100117 | 1.3% |
| 8 | 100083 | 1.3% |
| 7 | 100073 | 1.3% |
| 6 | 99722 | 1.2% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2000000 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 10000000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 2433572 | |
| 0 | 2400020 | |
| - | 2000000 | |
| 4 | 1300618 | |
| 5 | 899670 | 9.0% |
| 1 | 433168 | 4.3% |
| 3 | 132957 | 1.3% |
| 9 | 100117 | 1.0% |
| 8 | 100083 | 1.0% |
| 7 | 100073 | 1.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 10000000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 2433572 | |
| 0 | 2400020 | |
| - | 2000000 | |
| 4 | 1300618 | |
| 5 | 899670 | 9.0% |
| 1 | 433168 | 4.3% |
| 3 | 132957 | 1.3% |
| 9 | 100117 | 1.0% |
| 8 | 100083 | 1.0% |
| 7 | 100073 | 1.0% |
Claim_Amount
Real number (ℝ)
| Distinct | 629499 |
|---|---|
| Distinct (%) | 62.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5049.9249 |
| Minimum | 100.02 |
|---|---|
| Maximum | 9999.99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 100.02 |
|---|---|
| 5-th percentile | 596.46 |
| Q1 | 2577.415 |
| median | 5051.705 |
| Q3 | 7522.1925 |
| 95-th percentile | 9505.78 |
| Maximum | 9999.99 |
| Range | 9899.97 |
| Interquartile range (IQR) | 4944.7775 |
Descriptive statistics
| Standard deviation | 2856.3685 |
|---|---|
| Coefficient of variation (CV) | 0.56562594 |
| Kurtosis | -1.198428 |
| Mean | 5049.9249 |
| Median Absolute Deviation (MAD) | 2472.43 |
| Skewness | 0.00013385167 |
| Sum | 5.0499249 × 10^9 |
| Variance | 8158841.2 |
| Monotonicity | Not monotonic |
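Several of these figures follow from one another, so they can be cross-checked with basic arithmetic. The sketch below recomputes the range, IQR, CV, variance, and sum from the reported values. The final comparison against a continuous uniform distribution is only an assumption, suggested by the near-zero skewness and by a kurtosis close to -1.2, the excess kurtosis of a uniform distribution:

```python
import math

# Figures copied from the Claim_Amount tables.
n = 1_000_000
mean, std = 5049.9249, 2856.3685
minimum, maximum = 100.02, 9999.99
q1, q3 = 2577.415, 7522.1925

derived = {
    "range": maximum - minimum,   # reported: 9899.97
    "iqr": q3 - q1,               # reported: 4944.7775
    "cv": std / mean,             # reported: 0.56562594
    "variance": std ** 2,         # reported: 8158841.2
    "sum": mean * n,              # reported: 5.0499249 x 10^9
}
for name, value in derived.items():
    print(f"{name}: {value:.8g}")

# Assumption for comparison only: a continuous uniform distribution on
# [minimum, maximum] has standard deviation (max - min) / sqrt(12).
uniform_std = (maximum - minimum) / math.sqrt(12)
print(f"uniform std: {uniform_std:.1f} vs reported {std:.1f}")
```

Small differences from the reported values are expected because the inputs are already rounded.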
Common Values
| Value | Count | Frequency (%) |
| 8765.06 | 8 | < 0.1% |
| 6980.66 | 8 | < 0.1% |
| 8646.11 | 8 | < 0.1% |
| 3208.51 | 8 | < 0.1% |
| 3010.55 | 8 | < 0.1% |
| 7155.47 | 8 | < 0.1% |
| 9082.22 | 8 | < 0.1% |
| 1771.85 | 8 | < 0.1% |
| 828.37 | 7 | < 0.1% |
| 4566.8 | 7 | < 0.1% |
| Other values (629489) | 999922 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 100.02 | 1 | < 0.1% |
| 100.03 | 2 | < 0.1% |
| 100.06 | 1 | < 0.1% |
| 100.07 | 2 | < 0.1% |
| 100.08 | 3 | < 0.1% |
| 100.09 | 1 | < 0.1% |
| 100.11 | 3 | < 0.1% |
| 100.12 | 1 | < 0.1% |
| 100.13 | 1 | < 0.1% |
| 100.14 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 9999.99 | 1 | < 0.1% |
| 9999.95 | 1 | < 0.1% |
| 9999.94 | 2 | < 0.1% |
| 9999.93 | 1 | < 0.1% |
| 9999.92 | 3 | < 0.1% |
| 9999.91 | 2 | < 0.1% |
| 9999.89 | 1 | < 0.1% |
| 9999.88 | 2 | < 0.1% |
| 9999.85 | 2 | < 0.1% |
| 9999.83 | 1 | < 0.1% |
Paid_Amount
Real number (ℝ)
| Distinct | 630895 |
|---|---|
| Distinct (%) | 63.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5023.6087 |
| Minimum | 50.05 |
|---|---|
| Maximum | 10000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 50.05 |
|---|---|
| 5-th percentile | 551.4195 |
| Q1 | 2538.36 |
| median | 5023.16 |
| Q3 | 7509.83 |
| 95-th percentile | 9501.0705 |
| Maximum | 10000 |
| Range | 9949.95 |
| Interquartile range (IQR) | 4971.47 |
Descriptive statistics
| Standard deviation | 2870.8348 |
|---|---|
| Coefficient of variation (CV) | 0.57146863 |
| Kurtosis | -1.1993951 |
| Mean | 5023.6087 |
| Median Absolute Deviation (MAD) | 2485.845 |
| Skewness | 0.00083578856 |
| Sum | 5.0236087 × 10^9 |
| Variance | 8241692.4 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 3603.76 | 9 | < 0.1% |
| 7229.21 | 9 | < 0.1% |
| 1534.1 | 8 | < 0.1% |
| 7144.82 | 8 | < 0.1% |
| 4222.67 | 8 | < 0.1% |
| 3293.1 | 8 | < 0.1% |
| 2749.41 | 8 | < 0.1% |
| 2415.43 | 8 | < 0.1% |
| 7272.34 | 8 | < 0.1% |
| 7135.83 | 8 | < 0.1% |
| Other values (630885) | 999918 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 50.05 | 3 | < 0.1% |
| 50.06 | 1 | < 0.1% |
| 50.08 | 3 | < 0.1% |
| 50.1 | 1 | < 0.1% |
| 50.11 | 4 | < 0.1% |
| 50.17 | 2 | < 0.1% |
| 50.18 | 2 | < 0.1% |
| 50.19 | 1 | < 0.1% |
| 50.2 | 3 | < 0.1% |
| 50.21 | 2 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 10000 | 1 | < 0.1% |
| 9999.96 | 1 | < 0.1% |
| 9999.92 | 2 | < 0.1% |
| 9999.91 | 1 | < 0.1% |
| 9999.88 | 2 | < 0.1% |
| 9999.86 | 1 | < 0.1% |
| 9999.85 | 1 | < 0.1% |
| 9999.82 | 1 | < 0.1% |
| 9999.77 | 2 | < 0.1% |
| 9999.76 | 1 | < 0.1% |
Provider_Specialty
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| Cardiology | 334684 |
|---|---|
| General Medicine | 332797 |
| Orthopedics | 332519 |
Length
| Max length | 16 |
|---|---|
| Median length | 11 |
| Mean length | 12.329301 |
| Min length | 10 |
Characters and Unicode
| Total characters | 12329301 |
|---|---|
| Distinct characters | 20 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Orthopedics |
|---|---|
| 2nd row | Cardiology |
| 3rd row | Orthopedics |
| 4th row | Cardiology |
| 5th row | Orthopedics |
Common Values
| Value | Count | Frequency (%) |
| Cardiology | 334684 | 33.5% |
| General Medicine | 332797 | 33.3% |
| Orthopedics | 332519 | 33.3% |
Words
| Value | Count | Frequency (%) |
| cardiology | 334684 | |
| general | 332797 | |
| medicine | 332797 | |
| orthopedics | 332519 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 1663707 | |
| i | 1332797 | |
| o | 1001887 | 8.1% |
| r | 1000000 | 8.1% |
| d | 1000000 | 8.1% |
| a | 667481 | 5.4% |
| l | 667481 | 5.4% |
| n | 665594 | 5.4% |
| c | 665316 | 5.4% |
| C | 334684 | 2.7% |
| Other values (10) | 3330354 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 10663707 | 86.5% |
| Uppercase Letter | 1332797 | 10.8% |
| Space Separator | 332797 | 2.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 1663707 | |
| i | 1332797 | |
| o | 1001887 | |
| r | 1000000 | |
| d | 1000000 | |
| a | 667481 | |
| l | 667481 | |
| n | 665594 | 6.2% |
| c | 665316 | 6.2% |
| y | 334684 | 3.1% |
| Other values (5) | 1664760 |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 334684 | |
| G | 332797 | |
| M | 332797 | |
| O | 332519 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 332797 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 11996504 | |
| Common | 332797 | 2.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 1663707 | |
| i | 1332797 | |
| o | 1001887 | 8.4% |
| r | 1000000 | 8.3% |
| d | 1000000 | 8.3% |
| a | 667481 | 5.6% |
| l | 667481 | 5.6% |
| n | 665594 | 5.5% |
| c | 665316 | 5.5% |
| C | 334684 | 2.8% |
| Other values (9) | 2997557 |
Common
| Value | Count | Frequency (%) |
| (space) | 332797 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 12329301 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 1663707 | |
| i | 1332797 | |
| o | 1001887 | 8.1% |
| r | 1000000 | 8.1% |
| d | 1000000 | 8.1% |
| a | 667481 | 5.4% |
| l | 667481 | 5.4% |
| n | 665594 | 5.4% |
| c | 665316 | 5.4% |
| C | 334684 | 2.7% |
| Other values (10) | 3330354 |
Patient_Age
Real number (ℝ)
| Distinct | 73 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 53.988752 |
| Minimum | 18 |
|---|---|
| Maximum | 90 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 18 |
|---|---|
| 5-th percentile | 21 |
| Q1 | 36 |
| median | 54 |
| Q3 | 72 |
| 95-th percentile | 87 |
| Maximum | 90 |
| Range | 72 |
| Interquartile range (IQR) | 36 |
Descriptive statistics
| Standard deviation | 21.079818 |
|---|---|
| Coefficient of variation (CV) | 0.39044834 |
| Kurtosis | -1.2022395 |
| Mean | 53.988752 |
| Median Absolute Deviation (MAD) | 18 |
| Skewness | 0.00021125585 |
| Sum | 53988752 |
| Variance | 444.35875 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 27 | 13970 | 1.4% |
| 23 | 13962 | 1.4% |
| 71 | 13939 | 1.4% |
| 22 | 13874 | 1.4% |
| 39 | 13843 | 1.4% |
| 77 | 13832 | 1.4% |
| 67 | 13826 | 1.4% |
| 78 | 13818 | 1.4% |
| 53 | 13818 | 1.4% |
| 32 | 13817 | 1.4% |
| Other values (63) | 861301 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 18 | 13704 | 1.4% |
| 19 | 13619 | 1.4% |
| 20 | 13744 | 1.4% |
| 21 | 13578 | 1.4% |
| 22 | 13874 | 1.4% |
| 23 | 13962 | 1.4% |
| 24 | 13751 | 1.4% |
| 25 | 13647 | 1.4% |
| 26 | 13608 | 1.4% |
| 27 | 13970 | 1.4% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 90 | 13664 | 1.4% |
| 89 | 13720 | 1.4% |
| 88 | 13590 | 1.4% |
| 87 | 13679 | 1.4% |
| 86 | 13705 | 1.4% |
| 85 | 13814 | 1.4% |
| 84 | 13554 | 1.4% |
| 83 | 13759 | 1.4% |
| 82 | 13684 | 1.4% |
| 81 | 13781 | 1.4% |
Patient_Gender
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| Male | 500053 |
|---|---|
| Female | 499947 |
Length
| Max length | 6 |
|---|---|
| Median length | 4 |
| Mean length | 4.999894 |
| Min length | 4 |
Characters and Unicode
| Total characters | 4999894 |
|---|---|
| Distinct characters | 6 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Female |
|---|---|
| 2nd row | Female |
| 3rd row | Male |
| 4th row | Male |
| 5th row | Female |
Common Values
| Value | Count | Frequency (%) |
| Male | 500053 | 50.0% |
| Female | 499947 | 50.0% |
Words
| Value | Count | Frequency (%) |
| male | 500053 | |
| female | 499947 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 1499947 | |
| a | 1000000 | |
| l | 1000000 | |
| M | 500053 | 10.0% |
| F | 499947 | 10.0% |
| m | 499947 | 10.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3999894 | 80.0% |
| Uppercase Letter | 1000000 | 20.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 1499947 | |
| a | 1000000 | |
| l | 1000000 | |
| m | 499947 | 12.5% |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 500053 | |
| F | 499947 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4999894 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 1499947 | |
| a | 1000000 | |
| l | 1000000 | |
| M | 500053 | 10.0% |
| F | 499947 | 10.0% |
| m | 499947 | 10.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4999894 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 1499947 | |
| a | 1000000 | |
| l | 1000000 | |
| M | 500053 | 10.0% |
| F | 499947 | 10.0% |
| m | 499947 | 10.0% |
Fraud_Label
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| 0 | |
|---|---|
| 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1000000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 500449 | |
| 1 | 499551 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 0 | 500449 | |
| 1 | 499551 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 500449 | |
| 1 | 499551 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1000000 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 500449 | |
| 1 | 499551 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1000000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 500449 | |
| 1 | 499551 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1000000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 500449 | |
| 1 | 499551 |
Investigation_Details
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| Under investigation | |
|---|---|
| Suspicious | |
| Cleared |
Length
| Max length | 19 |
|---|---|
| Median length | 10 |
| Mean length | 12.002113 |
| Min length | 7 |
Characters and Unicode
| Total characters | 12002113 |
|---|---|
| Distinct characters | 19 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Cleared |
|---|---|
| 2nd row | Under investigation |
| 3rd row | Cleared |
| 4th row | Suspicious |
| 5th row | Under investigation |
Common Values
| Value | Count | Frequency (%) |
| Under investigation | 333505 | |
| Suspicious | 333351 | |
| Cleared | 333144 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| under | 333505 | |
| investigation | 333505 | |
| suspicious | 333351 | |
| cleared | 333144 |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 1667217 | |
| e | 1333298 | |
| n | 1000515 | 8.3% |
| s | 1000207 | 8.3% |
| t | 667010 | 5.6% |
| o | 666856 | 5.6% |
| u | 666702 | 5.6% |
| a | 666649 | 5.6% |
| d | 666649 | 5.6% |
| r | 666649 | 5.6% |
| Other values (9) | 3000361 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 10668608 | |
| Uppercase Letter | 1000000 | 8.3% |
| Space Separator | 333505 | 2.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 1667217 | |
| e | 1333298 | |
| n | 1000515 | |
| s | 1000207 | |
| t | 667010 | |
| o | 666856 | 6.3% |
| u | 666702 | 6.2% |
| a | 666649 | 6.2% |
| d | 666649 | 6.2% |
| r | 666649 | 6.2% |
| Other values (5) | 1666856 |
Uppercase Letter
| Value | Count | Frequency (%) |
| U | 333505 | |
| S | 333351 | |
| C | 333144 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 333505 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 11668608 | |
| Common | 333505 | 2.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 1667217 | |
| e | 1333298 | |
| n | 1000515 | 8.6% |
| s | 1000207 | 8.6% |
| t | 667010 | 5.7% |
| o | 666856 | 5.7% |
| u | 666702 | 5.7% |
| a | 666649 | 5.7% |
| d | 666649 | 5.7% |
| r | 666649 | 5.7% |
| Other values (8) | 2666856 |
Common
| Value | Count | Frequency (%) |
| (space) | 333505 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 12002113 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 1667217 | |
| e | 1333298 | |
| n | 1000515 | 8.3% |
| s | 1000207 | 8.3% |
| t | 667010 | 5.6% |
| o | 666856 | 5.6% |
| u | 666702 | 5.6% |
| a | 666649 | 5.6% |
| d | 666649 | 5.6% |
| r | 666649 | 5.6% |
| Other values (9) | 3000361 |
Policy_Type
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| HMO | |
|---|---|
| PPO |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 3000000 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | HMO |
|---|---|
| 2nd row | PPO |
| 3rd row | HMO |
| 4th row | PPO |
| 5th row | PPO |
Common Values
| Value | Count | Frequency (%) |
| HMO | 500937 | |
| PPO | 499063 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| hmo | 500937 | |
| ppo | 499063 |
Most occurring characters
| Value | Count | Frequency (%) |
| O | 1000000 | |
| P | 998126 | |
| H | 500937 | |
| M | 500937 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 3000000 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| O | 1000000 | |
| P | 998126 | |
| H | 500937 | |
| M | 500937 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3000000 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| O | 1000000 | |
| P | 998126 | |
| H | 500937 | |
| M | 500937 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3000000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| O | 1000000 | |
| P | 998126 | |
| H | 500937 | |
| M | 500937 |
Coverage_Amount
Real number (ℝ)
| Distinct | 367043 |
|---|---|
| Distinct (%) | 36.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2999.7301 |
| Minimum | 1000 |
|---|---|
| Maximum | 5000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 1000 |
|---|---|
| 5-th percentile | 1200.87 |
| Q1 | 2000.05 |
| median | 2999.905 |
| Q3 | 3998.2325 |
| 95-th percentile | 4799.5605 |
| Maximum | 5000 |
| Range | 4000 |
| Interquartile range (IQR) | 1998.1825 |
Descriptive statistics
| Standard deviation | 1154.2059 |
|---|---|
| Coefficient of variation (CV) | 0.3847699 |
| Kurtosis | -1.1989104 |
| Mean | 2999.7301 |
| Median Absolute Deviation (MAD) | 999.125 |
| Skewness | 1.4091863 × 10⁻⁵ |
| Sum | 2.9997301 × 10⁹ |
| Variance | 1332191.1 |
| Monotonicity | Not monotonic |
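Several of the Coverage_Amount figures are tied together by simple identities: range = maximum − minimum, IQR = Q3 − Q1, CV = standard deviation / mean, and variance = standard deviation². The following sketch checks the reported numbers against those identities. The tolerance is an assumption chosen to absorb the rounding of the displayed values.

```python
import math

# Figures copied from the Coverage_Amount tables above.
minimum, maximum = 1000.0, 5000.0
q1, q3 = 2000.05, 3998.2325
mean, std = 2999.7301, 1154.2059
reported = {"range": 4000.0, "iqr": 1998.1825, "cv": 0.3847699, "variance": 1332191.1}

derived = {
    "range": maximum - minimum,
    "iqr": q3 - q1,
    "cv": std / mean,
    "variance": std ** 2,
}

checks = {}
for name, value in derived.items():
    # The report shows roughly 8 significant digits, so compare loosely.
    checks[name] = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name}: derived={value:.8g} reported={reported[name]} match={checks[name]}")
```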
| Value | Count | Frequency (%) |
| 4364.56 | 13 | < 0.1% |
| 1624.74 | 13 | < 0.1% |
| 4380.77 | 12 | < 0.1% |
| 3957.21 | 12 | < 0.1% |
| 2456.5 | 12 | < 0.1% |
| 1286.24 | 12 | < 0.1% |
| 2992.39 | 12 | < 0.1% |
| 4464.15 | 12 | < 0.1% |
| 1420.82 | 12 | < 0.1% |
| 2544.41 | 11 | < 0.1% |
| Other values (367033) | 999879 |
| Value | Count | Frequency (%) |
| 1000 | 1 | < 0.1% |
| 1000.02 | 2 | |
| 1000.03 | 2 | |
| 1000.05 | 3 | |
| 1000.06 | 4 | |
| 1000.07 | 2 | |
| 1000.08 | 2 | |
| 1000.09 | 1 | < 0.1% |
| 1000.11 | 1 | < 0.1% |
| 1000.12 | 4 |
| Value | Count | Frequency (%) |
| 5000 | 1 | < 0.1% |
| 4999.98 | 1 | < 0.1% |
| 4999.97 | 1 | < 0.1% |
| 4999.96 | 5 | |
| 4999.95 | 2 | < 0.1% |
| 4999.94 | 4 | |
| 4999.93 | 1 | < 0.1% |
| 4999.92 | 3 | |
| 4999.91 | 5 | |
| 4999.9 | 6 |
Total_Charges
Real number (ℝ)
| Distinct | 629457 |
|---|---|
| Distinct (%) | 62.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5047.0554 |
| Minimum | 100.03 |
|---|---|
| Maximum | 9999.99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.6 MiB |
Quantile statistics
| Minimum | 100.03 |
|---|---|
| 5-th percentile | 593.3895 |
| Q1 | 2568.77 |
| median | 5045.89 |
| Q3 | 7524.6475 |
| 95-th percentile | 9500.7 |
| Maximum | 9999.99 |
| Range | 9899.96 |
| Interquartile range (IQR) | 4955.8775 |
Descriptive statistics
| Standard deviation | 2859.1245 |
|---|---|
| Coefficient of variation (CV) | 0.56649358 |
| Kurtosis | -1.202511 |
| Mean | 5047.0554 |
| Median Absolute Deviation (MAD) | 2477.84 |
| Skewness | 0.0014312172 |
| Sum | 5.0470554 × 10⁹ |
| Variance | 8174592.8 |
| Monotonicity | Not monotonic |
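Near-zero skewness together with a kurtosis close to −1.2 is what a continuous uniform distribution would produce, assuming the report lists excess kurtosis. As a rough sanity check, not a formal test, the sketch below compares the reported Total_Charges moments with the textbook moments of a uniform distribution on [minimum, maximum].

```python
import math

# Reported figures for Total_Charges, copied from the tables above.
a, b = 100.03, 9999.99  # minimum and maximum
reported = {"mean": 5047.0554, "std": 2859.1245, "kurtosis": -1.202511}

# Textbook moments of a continuous uniform distribution on [a, b].
uniform = {
    "mean": (a + b) / 2,
    "std": (b - a) / math.sqrt(12),
    "kurtosis": -6 / 5,  # excess kurtosis
}

rel_gap = {k: abs(reported[k] - uniform[k]) / abs(uniform[k]) for k in uniform}
for name, gap in rel_gap.items():
    print(f"{name}: uniform={uniform[name]:.4f} reported={reported[name]} gap={gap:.2%}")
```

All three gaps come out well under one percent, which is consistent with, though not proof of, values drawn roughly uniformly over the observed range.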
| Value | Count | Frequency (%) |
| 4842.87 | 9 | < 0.1% |
| 980.8 | 8 | < 0.1% |
| 8468.94 | 8 | < 0.1% |
| 1391.03 | 8 | < 0.1% |
| 8489.48 | 8 | < 0.1% |
| 4695.22 | 8 | < 0.1% |
| 5214.51 | 8 | < 0.1% |
| 1751.35 | 7 | < 0.1% |
| 8575.77 | 7 | < 0.1% |
| 6547.37 | 7 | < 0.1% |
| Other values (629447) | 999922 |
| Value | Count | Frequency (%) |
| 100.03 | 1 | < 0.1% |
| 100.04 | 2 | |
| 100.05 | 1 | < 0.1% |
| 100.06 | 2 | |
| 100.08 | 1 | < 0.1% |
| 100.09 | 1 | < 0.1% |
| 100.1 | 2 | |
| 100.11 | 2 | |
| 100.12 | 3 | |
| 100.13 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 9999.99 | 1 | |
| 9999.98 | 1 | |
| 9999.95 | 1 | |
| 9999.94 | 1 | |
| 9999.92 | 1 | |
| 9999.91 | 1 | |
| 9999.9 | 2 | |
| 9999.89 | 1 | |
| 9999.88 | 2 | |
| 9999.86 | 1 |
Payment_Type
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| Check | |
|---|---|
| Electronic Funds Transfer | |
| Credit Card |
Length
| Max length | 25 |
|---|---|
| Median length | 11 |
| Mean length | 13.661502 |
| Min length | 5 |
Characters and Unicode
| Total characters | 13661502 |
|---|---|
| Distinct characters | 20 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Credit Card |
|---|---|
| 2nd row | Credit Card |
| 3rd row | Credit Card |
| 4th row | Check |
| 5th row | Electronic Funds Transfer |
Common Values
| Value | Count | Frequency (%) |
| Check | 333862 | |
| Electronic Funds Transfer | 333191 | |
| Credit Card | 332947 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| check | 333862 | |
| electronic | 333191 | |
| funds | 333191 | |
| transfer | 333191 | |
| credit | 332947 | |
| card | 332947 |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 1665467 | |
| e | 1333191 | 9.8% |
| c | 1000244 | 7.3% |
| C | 999756 | 7.3% |
| n | 999573 | 7.3% |
| (space) | 999329 | 7.3% |
| d | 999085 | 7.3% |
| s | 666382 | 4.9% |
| t | 666138 | 4.9% |
| i | 666138 | 4.9% |
| Other values (10) | 3666199 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 10662844 | |
| Uppercase Letter | 1999329 | 14.6% |
| Space Separator | 999329 | 7.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 1665467 | |
| e | 1333191 | |
| c | 1000244 | |
| n | 999573 | |
| d | 999085 | |
| s | 666382 | 6.2% |
| t | 666138 | 6.2% |
| i | 666138 | 6.2% |
| a | 666138 | 6.2% |
| k | 333862 | 3.1% |
| Other values (5) | 1666626 |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 999756 | |
| E | 333191 | 16.7% |
| F | 333191 | 16.7% |
| T | 333191 | 16.7% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 999329 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 12662173 | |
| Common | 999329 | 7.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 1665467 | |
| e | 1333191 | |
| c | 1000244 | 7.9% |
| C | 999756 | 7.9% |
| n | 999573 | 7.9% |
| d | 999085 | 7.9% |
| s | 666382 | 5.3% |
| t | 666138 | 5.3% |
| i | 666138 | 5.3% |
| a | 666138 | 5.3% |
| Other values (9) | 3000061 |
Common
| Value | Count | Frequency (%) |
| (space) | 999329 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 13661502 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| r | 1665467 | |
| e | 1333191 | 9.8% |
| c | 1000244 | 7.3% |
| C | 999756 | 7.3% |
| n | 999573 | 7.3% |
| (space) | 999329 | 7.3% |
| d | 999085 | 7.3% |
| s | 666382 | 4.9% |
| t | 666138 | 4.9% |
| i | 666138 | 4.9% |
| Other values (10) | 3666199 |
State
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| Tokyo | |
|---|---|
| Mumbai | |
| Bangkok | |
| Beijing | |
| Seoul |
Length
| Max length | 7 |
|---|---|
| Median length | 6 |
| Mean length | 5.999581 |
| Min length | 5 |
Characters and Unicode
| Total characters | 5999581 |
|---|---|
| Distinct characters | 17 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Bangkok |
|---|---|
| 2nd row | Mumbai |
| 3rd row | Seoul |
| 4th row | Tokyo |
| 5th row | Mumbai |
Common Values
| Value | Count | Frequency (%) |
| Tokyo | 200487 | |
| Mumbai | 200109 | |
| Bangkok | 199945 | |
| Beijing | 199791 | |
| Seoul | 199668 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| tokyo | 200487 | |
| mumbai | 200109 | |
| bangkok | 199945 | |
| beijing | 199791 | |
| seoul | 199668 |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 800587 | |
| k | 600377 | |
| i | 599691 | |
| a | 400054 | 6.7% |
| u | 399777 | 6.7% |
| g | 399736 | 6.7% |
| n | 399736 | 6.7% |
| B | 399736 | 6.7% |
| e | 399459 | 6.7% |
| T | 200487 | 3.3% |
| Other values (7) | 1399941 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 4999581 | |
| Uppercase Letter | 1000000 | 16.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 800587 | |
| k | 600377 | |
| i | 599691 | |
| a | 400054 | |
| u | 399777 | |
| g | 399736 | |
| n | 399736 | |
| e | 399459 | |
| y | 200487 | 4.0% |
| b | 200109 | 4.0% |
| Other values (3) | 599568 |
Uppercase Letter
| Value | Count | Frequency (%) |
| B | 399736 | |
| T | 200487 | |
| M | 200109 | |
| S | 199668 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 5999581 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 800587 | |
| k | 600377 | |
| i | 599691 | |
| a | 400054 | 6.7% |
| u | 399777 | 6.7% |
| g | 399736 | 6.7% |
| n | 399736 | 6.7% |
| B | 399736 | 6.7% |
| e | 399459 | 6.7% |
| T | 200487 | 3.3% |
| Other values (7) | 1399941 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 5999581 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| o | 800587 | |
| k | 600377 | |
| i | 599691 | |
| a | 400054 | 6.7% |
| u | 399777 | 6.7% |
| g | 399736 | 6.7% |
| n | 399736 | 6.7% |
| B | 399736 | 6.7% |
| e | 399459 | 6.7% |
| T | 200487 | 3.3% |
| Other values (7) | 1399941 |
Email
Text
| Distinct | 522063 |
|---|---|
| Distinct (%) | 52.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
Length
| Max length | 34 |
|---|---|
| Median length | 30 |
| Mean length | 21.825615 |
| Min length | 15 |
Characters and Unicode
| Total characters | 21825615 |
|---|---|
| Distinct characters | 38 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 351695 |
|---|---|
| Unique (%) | 35.2% |
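The report separates "Distinct" (how many different values occur) from "Unique" (how many values occur exactly once), which is why Email has 522063 distinct addresses but only 351695 unique ones. A minimal sketch of the two counts on a made-up toy column; the addresses below are hypothetical and do not come from the dataset:

```python
from collections import Counter

# Hypothetical toy column, invented for illustration only.
emails = ["a@example.com", "b@example.org", "a@example.com",
          "c@example.net", "b@example.org", "d@example.com"]

counts = Counter(emails)
distinct = len(counts)                              # different values present
unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(distinct, unique)  # 4 2
```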
Sample
| 1st row | charlenekoch@example.org |
|---|---|
| 2nd row | ayersmelanie@example.org |
| 3rd row | madison17@example.com |
| 4th row | brittany18@example.org |
| 5th row | nharris@example.net |
| Value | Count | Frequency (%) |
| zsmith@example.com | 96 | < 0.1% |
| tsmith@example.net | 91 | < 0.1% |
| ismith@example.org | 87 | < 0.1% |
| csmith@example.net | 86 | < 0.1% |
| gsmith@example.org | 86 | < 0.1% |
| wsmith@example.net | 83 | < 0.1% |
| psmith@example.org | 83 | < 0.1% |
| ssmith@example.net | 83 | < 0.1% |
| dsmith@example.org | 82 | < 0.1% |
| ysmith@example.org | 81 | < 0.1% |
| Other values (522053) | 999142 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 3302750 | |
| a | 2025063 | 9.3% |
| m | 1676717 | 7.7% |
| l | 1577273 | 7.2% |
| o | 1223635 | 5.6% |
| r | 1137302 | 5.2% |
| p | 1129127 | 5.2% |
| n | 1119863 | 5.1% |
| x | 1023327 | 4.7% |
| @ | 1000000 | 4.6% |
| Other values (28) | 6610558 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 19325749 | |
| Other Punctuation | 2000000 | 9.2% |
| Decimal Number | 499866 | 2.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 3302750 | |
| a | 2025063 | |
| m | 1676717 | 8.7% |
| l | 1577273 | 8.2% |
| o | 1223635 | 6.3% |
| r | 1137302 | 5.9% |
| p | 1129127 | 5.8% |
| n | 1119863 | 5.8% |
| x | 1023327 | 5.3% |
| t | 757314 | 3.9% |
| Other values (16) | 4353378 |
Decimal Number
| Value | Count | Frequency (%) |
| 6 | 50257 | |
| 3 | 50177 | |
| 1 | 50166 | |
| 7 | 50044 | |
| 0 | 50044 | |
| 5 | 49991 | |
| 8 | 49942 | |
| 2 | 49866 | |
| 9 | 49761 | |
| 4 | 49618 |
Other Punctuation
| Value | Count | Frequency (%) |
| @ | 1000000 | |
| . | 1000000 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 19325749 | |
| Common | 2499866 | 11.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 3302750 | |
| a | 2025063 | |
| m | 1676717 | 8.7% |
| l | 1577273 | 8.2% |
| o | 1223635 | 6.3% |
| r | 1137302 | 5.9% |
| p | 1129127 | 5.8% |
| n | 1119863 | 5.8% |
| x | 1023327 | 5.3% |
| t | 757314 | 3.9% |
| Other values (16) | 4353378 |
Common
| Value | Count | Frequency (%) |
| @ | 1000000 | |
| . | 1000000 | |
| 6 | 50257 | 2.0% |
| 3 | 50177 | 2.0% |
| 1 | 50166 | 2.0% |
| 7 | 50044 | 2.0% |
| 0 | 50044 | 2.0% |
| 5 | 49991 | 2.0% |
| 8 | 49942 | 2.0% |
| 2 | 49866 | 2.0% |
| Other values (2) | 99379 | 4.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 21825615 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 3302750 | |
| a | 2025063 | 9.3% |
| m | 1676717 | 7.7% |
| l | 1577273 | 7.2% |
| o | 1223635 | 5.6% |
| r | 1137302 | 5.2% |
| p | 1129127 | 5.2% |
| n | 1119863 | 5.1% |
| x | 1023327 | 4.7% |
| @ | 1000000 | 4.6% |
| Other values (28) | 6610558 |
Phone_Number
Text
UNIQUE 
| Distinct | 1000000 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
Length
| Max length | 22 |
|---|---|
| Median length | 19 |
| Mean length | 16.164689 |
| Min length | 10 |
Characters and Unicode
| Total characters | 16164689 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 7 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1000000 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | 737.572.4230 |
|---|---|
| 2nd row | 001-284-213-6827x6429 |
| 3rd row | (320)856-6983 |
| 4th row | 860-217-1502 |
| 5th row | 658.620.1024 |
| Value | Count | Frequency (%) |
| 737.572.4230 | 1 | < 0.1% |
| 3583554754 | 1 | < 0.1% |
| 331.692.3101 | 1 | < 0.1% |
| 001-815-565-2083x183 | 1 | < 0.1% |
| 320)856-6983 | 1 | < 0.1% |
| 860-217-1502 | 1 | < 0.1% |
| 658.620.1024 | 1 | < 0.1% |
| 001-658-466-2696 | 1 | < 0.1% |
| 001-729-848-5510x689 | 1 | < 0.1% |
| 712)653-1749x486 | 1 | < 0.1% |
| Other values (999990) | 999990 |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 1561496 | |
| 0 | 1361012 | |
| 1 | 1360874 | |
| 3 | 1291978 | |
| 7 | 1291304 | |
| 9 | 1291024 | |
| 5 | 1289568 | |
| 2 | 1289519 | |
| 6 | 1289331 | |
| 8 | 1289199 | |
| Other values (6) | 2849384 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 13042315 | |
| Dash Punctuation | 1561496 | 9.7% |
| Lowercase Letter | 600600 | 3.7% |
| Other Punctuation | 398746 | 2.5% |
| Open Punctuation | 200595 | 1.2% |
| Close Punctuation | 200595 | 1.2% |
| Math Symbol | 160342 | 1.0% |
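The category names in this table match Unicode general categories. As an illustration, the sketch below classifies the characters of the five sample rows with Python's standard `unicodedata` module. The `NAMES` mapping is an assumption that pairs the short category codes with the labels used in the report.

```python
import unicodedata
from collections import Counter

# The five sample rows shown for Phone_Number.
samples = ["737.572.4230", "001-284-213-6827x6429", "(320)856-6983",
           "860-217-1502", "658.620.1024"]

# Short Unicode general-category codes mapped to the labels used in the report.
NAMES = {"Nd": "Decimal Number", "Pd": "Dash Punctuation", "Ll": "Lowercase Letter",
         "Po": "Other Punctuation", "Ps": "Open Punctuation",
         "Pe": "Close Punctuation", "Sm": "Math Symbol"}

category_counts = Counter(
    NAMES[unicodedata.category(ch)] for value in samples for ch in value
)
for name, count in category_counts.most_common():
    print(name, count)
```

None of the five samples contains `+`, so "Math Symbol" does not appear in this small run; `unicodedata.category("+")` returns `Sm`, which accounts for the seventh category in the full column.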
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1361012 | |
| 1 | 1360874 | |
| 3 | 1291978 | |
| 7 | 1291304 | |
| 9 | 1291024 | |
| 5 | 1289568 | |
| 2 | 1289519 | |
| 6 | 1289331 | |
| 8 | 1289199 | |
| 4 | 1288506 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1561496 |
Lowercase Letter
| Value | Count | Frequency (%) |
| x | 600600 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 398746 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 200595 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 200595 |
Math Symbol
| Value | Count | Frequency (%) |
| + | 160342 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 15564089 | |
| Latin | 600600 | 3.7% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| - | 1561496 | |
| 0 | 1361012 | |
| 1 | 1360874 | |
| 3 | 1291978 | |
| 7 | 1291304 | |
| 9 | 1291024 | |
| 5 | 1289568 | |
| 2 | 1289519 | |
| 6 | 1289331 | |
| 8 | 1289199 | |
| Other values (5) | 2248784 |
Latin
| Value | Count | Frequency (%) |
| x | 600600 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 16164689 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 1561496 | |
| 0 | 1361012 | |
| 1 | 1360874 | |
| 3 | 1291978 | |
| 7 | 1291304 | |
| 9 | 1291024 | |
| 5 | 1289568 | |
| 2 | 1289519 | |
| 6 | 1289331 | |
| 8 | 1289199 | |
| Other values (6) | 2849384 |
Address
Text
| Distinct | 999998 |
|---|---|
| Distinct (%) | > 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
Length
| Max length | 70 |
|---|---|
| Median length | 61 |
| Mean length | 44.709032 |
| Min length | 20 |
Characters and Unicode
| Total characters | 44709032 |
|---|---|
| Distinct characters | 65 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 999996 |
|---|---|
| Unique (%) | > 99.9% |
Sample
| 1st row | 9475 Christine Fort, Riveraview, TX 28683 |
|---|---|
| 2nd row | 012 Martinez Bridge, Popeview, OK 75771 |
| 3rd row | 8544 Roberts Estate Apt. 392, Port Mistyshire, WY 86425 |
| 4th row | 52151 Antonio Hill Suite 655, Lake Christian, NH 49512 |
| 5th row | 7112 Christopher Village Suite 120, North Emily, NJ 46503 |
| Value | Count | Frequency (%) |
| apt | 223663 | 3.0% |
| suite | 223393 | 3.0% |
| port | 72202 | 1.0% |
| box | 72052 | 1.0% |
| lake | 67628 | 0.9% |
| west | 64431 | 0.9% |
| south | 64078 | 0.9% |
| east | 63875 | 0.9% |
| north | 63853 | 0.9% |
| new | 63241 | 0.9% |
| Other values (141157) | 6398638 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 6377054 | 14.3% |
| e | 2299219 | 5.1% |
| , | 1928113 | 4.3% |
| a | 1855956 | 4.2% |
| t | 1776293 | 4.0% |
| r | 1659075 | 3.7% |
| i | 1462249 | 3.3% |
| o | 1461518 | 3.3% |
| n | 1389740 | 3.1% |
| s | 1192085 | 2.7% |
| Other values (55) | 23307730 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 19406357 | |
| Decimal Number | 10485857 | |
| Space Separator | 6377054 | 14.3% |
| Uppercase Letter | 6287988 | 14.1% |
| Other Punctuation | 2151776 | 4.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 2299219 | |
| a | 1855956 | |
| t | 1776293 | |
| r | 1659075 | 8.5% |
| i | 1462249 | 7.5% |
| o | 1461518 | 7.5% |
| n | 1389740 | 7.2% |
| s | 1192085 | 6.1% |
| l | 1010170 | 5.2% |
| h | 862985 | 4.4% |
| Other values (16) | 4437067 |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 735093 | 11.7% |
| S | 699519 | 11.1% |
| P | 460003 | 7.3% |
| M | 431983 | 6.9% |
| C | 385746 | 6.1% |
| N | 350069 | 5.6% |
| L | 254999 | 4.1% |
| D | 254318 | 4.0% |
| R | 243166 | 3.9% |
| W | 243058 | 3.9% |
| Other values (16) | 2230034 |
Decimal Number
| Value | Count | Frequency (%) |
| 5 | 1051646 | |
| 6 | 1050617 | |
| 8 | 1050387 | |
| 4 | 1050125 | |
| 7 | 1049378 | |
| 3 | 1049141 | |
| 1 | 1048331 | |
| 2 | 1047967 | |
| 9 | 1047842 | |
| 0 | 1040423 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 1928113 | |
| . | 223663 | 10.4% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 6377054 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 25694345 | |
| Common | 19014687 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 2299219 | 8.9% |
| a | 1855956 | 7.2% |
| t | 1776293 | 6.9% |
| r | 1659075 | 6.5% |
| i | 1462249 | 5.7% |
| o | 1461518 | 5.7% |
| n | 1389740 | 5.4% |
| s | 1192085 | 4.6% |
| l | 1010170 | 3.9% |
| h | 862985 | 3.4% |
| Other values (42) | 10725055 |
Common
| Value | Count | Frequency (%) |
| (space) | 6377054 | |
| , | 1928113 | 10.1% |
| 5 | 1051646 | 5.5% |
| 6 | 1050617 | 5.5% |
| 8 | 1050387 | 5.5% |
| 4 | 1050125 | 5.5% |
| 7 | 1049378 | 5.5% |
| 3 | 1049141 | 5.5% |
| 1 | 1048331 | 5.5% |
| 2 | 1047967 | 5.5% |
| Other values (3) | 2311928 | 12.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 44709032 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 6377054 | 14.3% |
| e | 2299219 | 5.1% |
| , | 1928113 | 4.3% |
| a | 1855956 | 4.2% |
| t | 1776293 | 4.0% |
| r | 1659075 | 3.7% |
| i | 1462249 | 3.3% |
| o | 1461518 | 3.3% |
| n | 1389740 | 3.1% |
| s | 1192085 | 2.7% |
| Other values (55) | 23307730 |
Nationality
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| Japanese | |
|---|---|
| Thai | |
| Korean | |
| Chinese | |
| Indian |
Length
| Max length | 8 |
|---|---|
| Median length | 7 |
| Mean length | 6.201669 |
| Min length | 4 |
Characters and Unicode
| Total characters | 6201669 |
|---|---|
| Distinct characters | 15 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Korean |
|---|---|
| 2nd row | Korean |
| 3rd row | Indian |
| 4th row | Thai |
| 5th row | Thai |
Common Values
| Value | Count | Frequency (%) |
| Japanese | 201001 | |
| Thai | 200063 | |
| Korean | 199830 | |
| Chinese | 199793 | |
| Indian | 199313 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| japanese | 201001 | |
| thai | 200063 | |
| korean | 199830 | |
| chinese | 199793 | |
| indian | 199313 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 1001418 | |
| a | 1001208 | |
| n | 999250 | |
| i | 599169 | |
| s | 400794 | |
| h | 399856 | 6.4% |
| J | 201001 | 3.2% |
| p | 201001 | 3.2% |
| T | 200063 | 3.2% |
| K | 199830 | 3.2% |
| Other values (5) | 998079 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 5201669 | |
| Uppercase Letter | 1000000 | 16.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 1001418 | |
| a | 1001208 | |
| n | 999250 | |
| i | 599169 | |
| s | 400794 | |
| h | 399856 | 7.7% |
| p | 201001 | 3.9% |
| o | 199830 | 3.8% |
| r | 199830 | 3.8% |
| d | 199313 | 3.8% |
Uppercase Letter
| Value | Count | Frequency (%) |
| J | 201001 | |
| T | 200063 | |
| K | 199830 | |
| C | 199793 | |
| I | 199313 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 6201669 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 1001418 | |
| a | 1001208 | |
| n | 999250 | |
| i | 599169 | |
| s | 400794 | |
| h | 399856 | 6.4% |
| J | 201001 | 3.2% |
| p | 201001 | 3.2% |
| T | 200063 | 3.2% |
| K | 199830 | 3.2% |
| Other values (5) | 998079 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6201669 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 1001418 | |
| a | 1001208 | |
| n | 999250 | |
| i | 599169 | |
| s | 400794 | |
| h | 399856 | 6.4% |
| J | 201001 | 3.2% |
| p | 201001 | 3.2% |
| T | 200063 | 3.2% |
| K | 199830 | 3.2% |
| Other values (5) | 998079 |
Passport_Number
Text
| Distinct | 999457 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
Length
| Max length | 11 |
|---|---|
| Median length | 11 |
| Mean length | 11 |
| Min length | 11 |
Characters and Unicode
| Total characters | 11000000 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 998914 |
|---|---|
| Unique (%) | 99.9% |
Sample
| 1st row | 317-45-4815 |
|---|---|
| 2nd row | 255-16-5382 |
| 3rd row | 255-07-5680 |
| 4th row | 210-10-2570 |
| 5th row | 044-28-8553 |
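Every Passport_Number value is 11 characters long, and the five samples share a three-two-four digit layout separated by hyphens. The sketch below checks the samples against that layout. The regular expression is an assumption inferred from the samples and has not been verified against the full column.

```python
import re

# The five sample rows shown for Passport_Number.
samples = ["317-45-4815", "255-16-5382", "255-07-5680", "210-10-2570", "044-28-8553"]

# Assumed layout: three digits, two digits, four digits, separated by hyphens.
PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")

all_match = all(PATTERN.fullmatch(s) is not None for s in samples)
digits_per_row = sum(ch.isdigit() for ch in samples[0])
dashes_per_row = samples[0].count("-")

print(all_match, digits_per_row, dashes_per_row)  # True 9 2
```

Nine digits and two hyphens per row, over 1000000 rows, agree with the 9000000 Decimal Number and 2000000 Dash Punctuation characters reported for this column.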
| Value | Count | Frequency (%) |
| 531-17-4217 | 2 | < 0.1% |
| 734-90-6781 | 2 | < 0.1% |
| 144-98-1327 | 2 | < 0.1% |
| 278-81-0975 | 2 | < 0.1% |
| 779-37-8948 | 2 | < 0.1% |
| 648-07-1991 | 2 | < 0.1% |
| 699-28-0936 | 2 | < 0.1% |
| 402-36-5255 | 2 | < 0.1% |
| 479-51-5826 | 2 | < 0.1% |
| 065-65-8309 | 2 | < 0.1% |
| Other values (999447) | 999980 |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 2000000 | |
| 7 | 915769 | |
| 4 | 915162 | |
| 1 | 914201 | |
| 5 | 913177 | |
| 6 | 913082 | |
| 3 | 913017 | |
| 2 | 912359 | |
| 8 | 911622 | |
| 0 | 888901 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 9000000 | |
| Dash Punctuation | 2000000 | 18.2% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 7 | 915769 | |
| 4 | 915162 | |
| 1 | 914201 | |
| 5 | 913177 | |
| 6 | 913082 | |
| 3 | 913017 | |
| 2 | 912359 | |
| 8 | 911622 | |
| 0 | 888901 | |
| 9 | 802710 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2000000 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 11000000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| - | 2000000 | |
| 7 | 915769 | |
| 4 | 915162 | |
| 1 | 914201 | |
| 5 | 913177 | |
| 6 | 913082 | |
| 3 | 913017 | |
| 2 | 912359 | |
| 8 | 911622 | |
| 0 | 888901 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 11000000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 2000000 | |
| 7 | 915769 | |
| 4 | 915162 | |
| 1 | 914201 | |
| 5 | 913177 | |
| 6 | 913082 | |
| 3 | 913017 | |
| 2 | 912359 | |
| 8 | 911622 | |
| 0 | 888901 |
Employer
Text
| Distinct | 535712 |
|---|---|
| Distinct (%) | 53.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
Length
| Max length | 38 |
|---|---|
| Median length | 33 |
| Mean length | 16.536182 |
| Min length | 5 |
Characters and Unicode
| Total characters | 16536182 |
|---|---|
| Distinct characters | 54 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 471426 |
|---|---|
| Unique (%) | 47.1% |
Sample
| 1st row | Mitchell-Mcintosh |
|---|---|
| 2nd row | Galloway, Castillo and Smith |
| 3rd row | Phillips, Bryant and Murphy |
| 4th row | Lee, Jackson and Hoffman |
| 5th row | Peterson, Lopez and Blake |
| Value | Count | Frequency (%) |
| and | 389443 | 16.3% |
| plc | 55884 | 2.3% |
| sons | 55868 | 2.3% |
| ltd | 55569 | 2.3% |
| inc | 55517 | 2.3% |
| llc | 55473 | 2.3% |
| group | 55150 | 2.3% |
| smith | 28877 | 1.2% |
| johnson | 22437 | 0.9% |
| williams | 18685 | 0.8% |
| Other values (199047) | 1597151 |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 1523592 | 9.2% |
| (space) | 1390054 | 8.4% |
| a | 1374546 | 8.3% |
| e | 1189066 | 7.2% |
| r | 1075569 | 6.5% |
| o | 1062224 | 6.4% |
| s | 785990 | 4.8% |
| d | 728511 | 4.4% |
| l | 719158 | 4.3% |
| i | 671214 | 4.1% |
| Other values (44) | 6016258 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 11923300 | |
| Uppercase Letter | 2556289 | 15.5% |
| Space Separator | 1390054 | 8.4% |
| Other Punctuation | 333575 | 2.0% |
| Dash Punctuation | 332964 | 2.0% |
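The category names in this table correspond to Unicode general categories. As a rough sketch of how such a breakdown can be reproduced, the snippet below classifies characters with Python's standard `unicodedata` module. The `LABELS` mapping and the `category_counts` helper are illustrative names, not part of the report, and the input is two of the sample Employer values.

```python
import unicodedata
from collections import Counter

# Unicode general category code -> label as it appears in the report.
LABELS = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw code for categories not listed above.
            counts[LABELS.get(code, code)] += 1
    return counts


sample = ["Mitchell-Mcintosh", "Lee, Jackson and Hoffman"]
result = category_counts(sample)
for label, count in result.most_common():
    print(label, count)
```

The hyphen falls under Dash Punctuation and the comma under Other Punctuation, which matches the two punctuation rows in the table.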
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 1523592 | |
| a | 1374546 | |
| e | 1189066 | |
| r | 1075569 | |
| o | 1062224 | |
| s | 785990 | 6.6% |
| d | 728511 | 6.1% |
| l | 719158 | 6.0% |
| i | 671214 | 5.6% |
| t | 493676 | 4.1% |
| Other values (16) | 2299754 |
Uppercase Letter
| Value | Count | Frequency (%) |
| L | 304438 | |
| C | 262203 | |
| S | 228105 | 8.9% |
| M | 209373 | 8.2% |
| G | 169790 | 6.6% |
| W | 163436 | 6.4% |
| B | 161440 | 6.3% |
| H | 161302 | 6.3% |
| P | 152631 | 6.0% |
| R | 137733 | 5.4% |
| Other values (15) | 605838 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1390054 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 333575 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 332964 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 14479589 | |
| Common | 2056593 | 12.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| n | 1523592 | 10.5% |
| a | 1374546 | 9.5% |
| e | 1189066 | 8.2% |
| r | 1075569 | 7.4% |
| o | 1062224 | 7.3% |
| s | 785990 | 5.4% |
| d | 728511 | 5.0% |
| l | 719158 | 5.0% |
| i | 671214 | 4.6% |
| t | 493676 | 3.4% |
| Other values (41) | 4856043 |
Common
| Value | Count | Frequency (%) |
| (space) | 1390054 | |
| , | 333575 | 16.2% |
| - | 332964 | 16.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 16536182 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| n | 1523592 | 9.2% |
| (space) | 1390054 | 8.4% |
| a | 1374546 | 8.3% |
| e | 1189066 | 7.2% |
| r | 1075569 | 6.5% |
| o | 1062224 | 6.4% |
| s | 785990 | 4.8% |
| d | 728511 | 4.4% |
| l | 719158 | 4.3% |
| i | 671214 | 4.1% |
| Other values (44) | 6016258 |
Occupation
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| Businessman | |
|---|---|
| Engineer | |
| Teacher | |
| Artist | |
| Doctor |
Length
| Max length | 11 |
|---|---|
| Median length | 8 |
| Mean length | 7.601346 |
| Min length | 6 |
Characters and Unicode
| Total characters | 7601346 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Teacher |
|---|---|
| 2nd row | Businessman |
| 3rd row | Teacher |
| 4th row | Doctor |
| 5th row | Engineer |
Common Values
| Value | Count | Frequency (%) |
| Businessman | 200187 | |
| Engineer | 200149 | |
| Teacher | 200113 | |
| Artist | 199966 | |
| Doctor | 199585 |
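The total character count and mean length reported for Occupation can be cross-checked from the Common Values counts alone. The sketch below is plain arithmetic on the figures in the table above; the variable names are illustrative.

```python
# Counts copied from the Common Values table for Occupation.
counts = {
    "Businessman": 200187,
    "Engineer": 200149,
    "Teacher": 200113,
    "Artist": 199966,
    "Doctor": 199585,
}

# Number of observations covered by the five categories.
n_rows = sum(counts.values())

# Each value contributes len(value) characters per occurrence.
total_chars = sum(len(value) * count for value, count in counts.items())

mean_length = total_chars / n_rows

print(n_rows)       # 1000000
print(total_chars)  # 7601346
print(mean_length)  # 7.601346
```

Both results agree with the "Total characters" and "Mean length" rows reported for this variable.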
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| businessman | 200187 | |
| engineer | 200149 | |
| teacher | 200113 | |
| artist | 199966 | |
| doctor | 199585 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 1000711 | |
| n | 800672 | |
| s | 800527 | |
| r | 799813 | |
| i | 600302 | 7.9% |
| t | 599517 | 7.9% |
| a | 400300 | 5.3% |
| c | 399698 | 5.3% |
| o | 399170 | 5.3% |
| B | 200187 | 2.6% |
| Other values (8) | 1600449 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 6601346 | |
| Uppercase Letter | 1000000 | 13.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 1000711 | |
| n | 800672 | |
| s | 800527 | |
| r | 799813 | |
| i | 600302 | |
| t | 599517 | |
| a | 400300 | |
| c | 399698 | 6.1% |
| o | 399170 | 6.0% |
| u | 200187 | 3.0% |
| Other values (3) | 600449 |
Uppercase Letter
| Value | Count | Frequency (%) |
| B | 200187 | |
| E | 200149 | |
| T | 200113 | |
| A | 199966 | |
| D | 199585 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 7601346 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 1000711 | |
| n | 800672 | |
| s | 800527 | |
| r | 799813 | |
| i | 600302 | 7.9% |
| t | 599517 | 7.9% |
| a | 400300 | 5.3% |
| c | 399698 | 5.3% |
| o | 399170 | 5.3% |
| B | 200187 | 2.6% |
| Other values (8) | 1600449 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 7601346 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 1000711 | |
| n | 800672 | |
| s | 800527 | |
| r | 799813 | |
| i | 600302 | 7.9% |
| t | 599517 | 7.9% |
| a | 400300 | 5.3% |
| c | 399698 | 5.3% |
| o | 399170 | 5.3% |
| B | 200187 | 2.6% |
| Other values (8) | 1600449 |
Marital_Status
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| Divorced | |
|---|---|
| Single | |
| Widowed | |
| Married |
Length
| Max length | 8 |
|---|---|
| Median length | 7 |
| Mean length | 7.000107 |
| Min length | 6 |
Characters and Unicode
| Total characters | 7000107 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Divorced |
|---|---|
| 2nd row | Married |
| 3rd row | Divorced |
| 4th row | Divorced |
| 5th row | Divorced |
Common Values
| Value | Count | Frequency (%) |
| Divorced | 250387 | |
| Single | 250280 | |
| Widowed | 249678 | |
| Married | 249655 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| divorced | 250387 | |
| single | 250280 | |
| widowed | 249678 | |
| married | 249655 |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 1000000 | |
| e | 1000000 | |
| d | 999398 | |
| r | 749697 | |
| o | 500065 | 7.1% |
| D | 250387 | 3.6% |
| v | 250387 | 3.6% |
| c | 250387 | 3.6% |
| S | 250280 | 3.6% |
| n | 250280 | 3.6% |
| Other values (6) | 1499226 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 6000107 | |
| Uppercase Letter | 1000000 | 14.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 1000000 | |
| e | 1000000 | |
| d | 999398 | |
| r | 749697 | |
| o | 500065 | |
| v | 250387 | 4.2% |
| c | 250387 | 4.2% |
| n | 250280 | 4.2% |
| g | 250280 | 4.2% |
| l | 250280 | 4.2% |
| Other values (2) | 499333 |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 250387 | |
| S | 250280 | |
| W | 249678 | |
| M | 249655 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 7000107 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 1000000 | |
| e | 1000000 | |
| d | 999398 | |
| r | 749697 | |
| o | 500065 | 7.1% |
| D | 250387 | 3.6% |
| v | 250387 | 3.6% |
| c | 250387 | 3.6% |
| S | 250280 | 3.6% |
| n | 250280 | 3.6% |
| Other values (6) | 1499226 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 7000107 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 1000000 | |
| e | 1000000 | |
| d | 999398 | |
| r | 749697 | |
| o | 500065 | 7.1% |
| D | 250387 | 3.6% |
| v | 250387 | 3.6% |
| c | 250387 | 3.6% |
| S | 250280 | 3.6% |
| n | 250280 | 3.6% |
| Other values (6) | 1499226 |
Education_Level
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.6 MiB |
| PhD | |
|---|---|
| High School | |
| Bachelor | |
| Master |
Length
| Max length | 11 |
|---|---|
| Median length | 8 |
| Mean length | 6.999697 |
| Min length | 3 |
Characters and Unicode
| Total characters | 6999697 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | High School |
|---|---|
| 2nd row | PhD |
| 3rd row | Master |
| 4th row | Bachelor |
| 5th row | Bachelor |
Common Values
| Value | Count | Frequency (%) |
| PhD | 250678 | |
| High School | 250441 | |
| Bachelor | 249763 | |
| Master | 249118 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| phd | 250678 | |
| high | 250441 | |
| school | 250441 | |
| bachelor | 249763 | |
| master | 249118 |
Most occurring characters
| Value | Count | Frequency (%) |
| h | 1001323 | |
| o | 750645 | 10.7% |
| c | 500204 | 7.1% |
| l | 500204 | 7.1% |
| r | 498881 | 7.1% |
| e | 498881 | 7.1% |
| a | 498881 | 7.1% |
| P | 250678 | 3.6% |
| D | 250678 | 3.6% |
| S | 250441 | 3.6% |
| Other values (8) | 1998881 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 5248137 | |
| Uppercase Letter | 1501119 | 21.4% |
| Space Separator | 250441 | 3.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| h | 1001323 | |
| o | 750645 | |
| c | 500204 | |
| l | 500204 | |
| r | 498881 | |
| e | 498881 | |
| a | 498881 | |
| g | 250441 | 4.8% |
| i | 250441 | 4.8% |
| s | 249118 | 4.7% |
Uppercase Letter
| Value | Count | Frequency (%) |
| P | 250678 | |
| D | 250678 | |
| S | 250441 | |
| H | 250441 | |
| B | 249763 | |
| M | 249118 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 250441 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 6749256 | |
| Common | 250441 | 3.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| h | 1001323 | |
| o | 750645 | |
| c | 500204 | 7.4% |
| l | 500204 | 7.4% |
| r | 498881 | 7.4% |
| e | 498881 | 7.4% |
| a | 498881 | 7.4% |
| P | 250678 | 3.7% |
| D | 250678 | 3.7% |
| S | 250441 | 3.7% |
| Other values (7) | 1748440 |
Common
| Value | Count | Frequency (%) |
| (space) | 250441 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6999697 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| h | 1001323 | |
| o | 750645 | 10.7% |
| c | 500204 | 7.1% |
| l | 500204 | 7.1% |
| r | 498881 | 7.1% |
| e | 498881 | 7.1% |
| a | 498881 | 7.1% |
| P | 250678 | 3.6% |
| D | 250678 | 3.6% |
| S | 250441 | 3.6% |
| Other values (8) | 1998881 |
Correlations
| | Admission_Date | Claim_Amount | Coverage_Amount | Discharge_Date | Education_Level | Fraud_Label | Investigation_Details | Marital_Status | Nationality | Occupation | Paid_Amount | Patient_Age | Patient_Gender | Payment_Type | Policy_Type | Provider_ID | Provider_Specialty | State | Total_Charges |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Admission_Date | 1.000 | 0.001 | -0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.002 | 0.000 | 0.000 | -0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.001 | 0.000 | -0.001 |
| Claim_Amount | 0.001 | 1.000 | 0.001 | 0.000 | 0.001 | 0.001 | 0.000 | 0.003 | 0.000 | 0.000 | -0.001 | 0.000 | 0.000 | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 | -0.000 |
| Coverage_Amount | -0.001 | 0.001 | 1.000 | 0.002 | 0.000 | 0.000 | 0.002 | 0.001 | 0.001 | 0.000 | 0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.002 | 0.000 |
| Discharge_Date | 0.000 | 0.000 | 0.002 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.001 | 0.001 | 0.002 | 0.001 | 0.003 | 0.000 | 0.000 | 0.000 | 0.001 |
| Education_Level | 0.000 | 0.001 | 0.000 | 0.000 | 1.000 | 0.001 | 0.002 | 0.002 | 0.000 | 0.002 | -0.001 | 0.000 | 0.000 | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 | -0.001 |
| Fraud_Label | 0.000 | 0.001 | 0.000 | 0.000 | 0.001 | 1.000 | 0.001 | 0.000 | 0.001 | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.003 | 0.000 | 0.001 | 0.000 | 0.001 |
| Investigation_Details | 0.000 | 0.000 | 0.002 | 0.000 | 0.002 | 0.001 | 1.000 | 0.000 | 0.000 | 0.000 | -0.000 | 0.000 | 0.000 | 0.000 | 0.002 | 0.001 | 0.000 | 0.001 | -0.002 |
| Marital_Status | 0.002 | 0.003 | 0.001 | 0.000 | 0.002 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.001 | 0.001 | 0.000 | 0.002 | 0.000 | 0.000 | 0.001 | 0.001 | 0.000 |
| Nationality | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 1.000 | 0.000 | -0.000 | 0.002 | 0.000 | 0.000 | 0.000 | 0.001 | 0.001 | 0.000 | -0.002 |
| Occupation | 0.000 | 0.000 | 0.000 | 0.000 | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.001 | 0.001 | 0.000 | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.000 |
| Paid_Amount | -0.001 | -0.001 | 0.001 | 0.001 | -0.001 | 0.000 | -0.000 | 0.001 | -0.000 | 0.001 | 1.000 | 0.000 | 0.000 | 0.004 | 0.000 | 0.000 | 0.002 | 0.001 | -0.000 |
| Patient_Age | 0.000 | 0.000 | 0.000 | 0.001 | 0.000 | 0.001 | 0.000 | 0.001 | 0.002 | 0.001 | 0.000 | 1.000 | 0.000 | 0.002 | 0.001 | 0.001 | 0.002 | 0.001 | -0.000 |
| Patient_Gender | 0.000 | 0.000 | 0.000 | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.001 | 0.000 | 0.003 | 0.000 | 0.001 | -0.000 |
| Payment_Type | 0.000 | 0.002 | 0.000 | 0.001 | 0.002 | 0.000 | 0.000 | 0.002 | 0.000 | 0.000 | 0.004 | 0.002 | 0.001 | 1.000 | 0.000 | 0.000 | 0.001 | 0.000 | -0.001 |
| Policy_Type | 0.000 | 0.000 | 0.000 | 0.003 | 0.000 | 0.003 | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 1.000 | 0.001 | 0.000 | 0.000 | 0.001 |
| Provider_ID | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.001 | 0.000 | 0.001 | 0.001 | 0.000 | 0.001 | 0.003 | 0.000 | 0.001 | 1.000 | 0.000 | 0.001 | 0.002 |
| Provider_Specialty | 0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.001 | 0.000 | 0.001 | 0.001 | 0.000 | 0.002 | 0.002 | 0.000 | 0.001 | 0.000 | 0.000 | 1.000 | 0.001 | -0.001 |
| State | 0.000 | 0.000 | 0.002 | 0.000 | 0.000 | 0.000 | 0.001 | 0.001 | 0.000 | 0.000 | 0.001 | 0.001 | 0.001 | 0.000 | 0.000 | 0.001 | 0.001 | 1.000 | 0.001 |
| Total_Charges | -0.001 | -0.000 | 0.000 | 0.001 | -0.001 | 0.001 | -0.002 | 0.000 | -0.002 | 0.000 | -0.000 | -0.000 | -0.000 | -0.001 | 0.001 | 0.002 | -0.001 | 0.001 | 1.000 |
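Every off-diagonal entry in the matrix above is within 0.004 of zero, so no pair of variables shows a measurable association. This excerpt does not state which measure produced the values. For pairs of categorical columns, one common choice is Cramér's V, and the sketch below assumes the uncorrected form of it purely for illustration. `cramers_v` is a hypothetical helper, not the report's implementation.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length categorical sequences.

    Returns a value in [0, 1]: 0 means no association in the sample,
    1 means each value of one variable determines the other.
    """
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("inputs must be non-empty and of equal length")

    joint = Counter(zip(xs, ys))   # observed cell counts
    rows = Counter(xs)             # marginal counts for xs
    cols = Counter(ys)             # marginal counts for ys

    k = min(len(rows), len(cols)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for r, r_count in rows.items():
        for c, c_count in cols.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected

    return sqrt(chi2 / (n * k))


# Toy inputs (not taken from the dataset).
perfect = cramers_v(["a", "b", "a", "b"], ["x", "y", "x", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)  # 1.0 0.0
```

Values as small as those in the matrix sit at the "independent" end of this scale.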
First rows
| | Provider_ID | Claim_ID | Patient_ID | Diagnosis_Code | Procedure_Code | Claim_Date | Admission_Date | Discharge_Date | Claim_Amount | Paid_Amount | Provider_Specialty | Patient_Age | Patient_Gender | Fraud_Label | Investigation_Details | Policy_Type | Coverage_Amount | Total_Charges | Payment_Type | State | | Phone_Number | Address | Nationality | Passport_Number | Employer | Occupation | Marital_Status | Education_Level |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Asian Medical Center | CLAIM_1 | Darrell Blair | DX_714 | PROC_2648 | 2024-04-24 | 2024-03-26 | 2024-05-08 | 1077.86 | 4362.78 | Orthopedics | 76 | Female | 1 | Cleared | HMO | 3880.49 | 9913.47 | Credit Card | Bangkok | charlenekoch@example.org | 737.572.4230 | 9475 Christine Fort, Riveraview, TX 28683 | Korean | 317-45-4815 | Mitchell-Mcintosh | Teacher | Divorced | High School |
| 1 | Sky Hospital | CLAIM_2 | William Young | DX_885 | PROC_9084 | 2024-04-24 | 2024-04-07 | 2024-05-03 | 4998.88 | 5867.30 | Cardiology | 73 | Female | 1 | Under investigation | PPO | 1541.03 | 7723.89 | Credit Card | Mumbai | ayersmelanie@example.org | 001-284-213-6827x6429 | 012 Martinez Bridge, Popeview, OK 75771 | Korean | 255-16-5382 | Galloway, Castillo and Smith | Businessman | Married | PhD |
| 2 | Moon Healthcare | CLAIM_3 | Keith Reynolds | DX_988 | PROC_9747 | 2024-04-24 | 2024-04-01 | 2024-05-24 | 7058.21 | 8526.15 | Orthopedics | 34 | Male | 0 | Cleared | HMO | 2047.66 | 9671.58 | Credit Card | Seoul | madison17@example.com | (320)856-6983 | 8544 Roberts Estate Apt. 392, Port Mistyshire, WY 86425 | Indian | 255-07-5680 | Phillips, Bryant and Murphy | Teacher | Divorced | Master |
| 3 | Sky Hospital | CLAIM_4 | Andre Kelly | DX_779 | PROC_4334 | 2024-04-24 | 2024-03-31 | 2024-04-27 | 1628.67 | 8317.18 | Cardiology | 58 | Male | 0 | Suspicious | PPO | 3198.92 | 7887.55 | Check | Tokyo | brittany18@example.org | 860-217-1502 | 52151 Antonio Hill Suite 655, Lake Christian, NH 49512 | Thai | 210-10-2570 | Lee, Jackson and Hoffman | Doctor | Divorced | Bachelor |
| 4 | Sun Clinic | CLAIM_5 | Terry Gonzales | DX_644 | PROC_8408 | 2024-04-24 | 2024-03-27 | 2024-05-12 | 1480.43 | 4136.33 | Orthopedics | 90 | Female | 0 | Under investigation | PPO | 2935.93 | 332.80 | Electronic Funds Transfer | Mumbai | nharris@example.net | 658.620.1024 | 7112 Christopher Village Suite 120, North Emily, NJ 46503 | Thai | 044-28-8553 | Peterson, Lopez and Blake | Engineer | Divorced | Bachelor |
| 5 | Moon Healthcare | CLAIM_6 | Tony Gordon | DX_892 | PROC_2335 | 2024-04-24 | 2024-04-01 | 2024-05-19 | 3870.00 | 7623.96 | Orthopedics | 37 | Female | 0 | Suspicious | PPO | 3371.45 | 2564.29 | Electronic Funds Transfer | Tokyo | johnallen@example.org | 001-658-466-2696 | 249 Robert Shoals Apt. 813, East Emily, NJ 06925 | Indian | 335-64-4833 | Odonnell LLC | Engineer | Widowed | PhD |
| 6 | Moon Healthcare | CLAIM_7 | Ross Atkinson | DX_534 | PROC_7298 | 2024-04-24 | 2024-04-06 | 2024-05-22 | 4343.76 | 7886.82 | General Medicine | 44 | Male | 0 | Under investigation | HMO | 4608.70 | 4185.24 | Electronic Funds Transfer | Tokyo | dylanhamilton@example.com | 001-729-848-5510x689 | 26394 Ashley Estates Suite 630, North Steven, PA 62365 | Indian | 289-19-4177 | Howard-Ford | Engineer | Married | High School |
| 7 | Asian Medical Center | CLAIM_8 | Meghan Bryant | DX_235 | PROC_8273 | 2024-04-24 | 2024-04-02 | 2024-05-12 | 1477.68 | 6933.63 | General Medicine | 53 | Male | 0 | Under investigation | PPO | 4629.35 | 2340.75 | Check | Beijing | hayescatherine@example.org | (712)653-1749x486 | 818 Michael Canyon Apt. 314, Paulland, MH 23442 | Japanese | 535-78-2674 | Clark, Oneill and Anderson | Teacher | Widowed | Bachelor |
| 8 | Asian Medical Center | CLAIM_9 | Melissa Petersen | DX_850 | PROC_5437 | 2024-04-24 | 2024-04-20 | 2024-05-09 | 1405.24 | 4547.49 | Cardiology | 59 | Male | 1 | Suspicious | HMO | 1267.65 | 9384.24 | Credit Card | Beijing | darrell46@example.net | (730)907-6852x84174 | 59062 Diana Harbor Apt. 850, West Emilymouth, VT 34613 | Korean | 366-96-8210 | Thompson-Hart | Engineer | Divorced | Master |
| 9 | Asian Medical Center | CLAIM_10 | Diana Schmidt | DX_931 | PROC_1064 | 2024-04-24 | 2024-04-23 | 2024-05-18 | 9063.09 | 916.38 | Orthopedics | 50 | Female | 1 | Cleared | HMO | 2532.27 | 8775.42 | Credit Card | Mumbai | parkerfrank@example.com | +1-330-568-1956x9767 | 661 Emily Gateway, Bergland, GU 41122 | Thai | 088-60-4249 | Johnston, Mclaughlin and Williamson | Doctor | Married | Bachelor |
Last rows
| | Provider_ID | Claim_ID | Patient_ID | Diagnosis_Code | Procedure_Code | Claim_Date | Admission_Date | Discharge_Date | Claim_Amount | Paid_Amount | Provider_Specialty | Patient_Age | Patient_Gender | Fraud_Label | Investigation_Details | Policy_Type | Coverage_Amount | Total_Charges | Payment_Type | State | | Phone_Number | Address | Nationality | Passport_Number | Employer | Occupation | Marital_Status | Education_Level |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 999990 | Eastern Hospital | CLAIM_999991 | Lisa Parker | DX_178 | PROC_7308 | 2024-04-24 | 2024-04-12 | 2024-05-05 | 7971.30 | 919.30 | Orthopedics | 48 | Female | 0 | Under investigation | HMO | 2103.45 | 7022.81 | Electronic Funds Transfer | Tokyo | mooresteven@example.org | 698-699-8032x3134 | 4895 Kelly Mission, New Andrea, PA 65056 | Japanese | 316-10-2860 | Cardenas LLC | Doctor | Married | PhD |
| 999991 | Sky Hospital | CLAIM_999992 | Elizabeth Richards | DX_362 | PROC_1005 | 2024-04-24 | 2024-03-30 | 2024-05-23 | 9088.26 | 5037.26 | Cardiology | 69 | Male | 0 | Cleared | PPO | 3592.40 | 9290.50 | Credit Card | Bangkok | hillapril@example.org | +1-345-313-3925x738 | 9273 Victoria Greens Suite 025, Port Jamesside, ID 86274 | Chinese | 630-53-3415 | Marquez-Holmes | Engineer | Married | High School |
| 999992 | Sky Hospital | CLAIM_999993 | Margaret Carlson | DX_625 | PROC_8512 | 2024-04-24 | 2024-04-21 | 2024-05-09 | 5412.93 | 5131.20 | Orthopedics | 25 | Female | 1 | Cleared | PPO | 2414.18 | 5344.02 | Credit Card | Beijing | hrobbins@example.net | 847.549.0627 | 119 Prince Valley, Lake Carlosbury, WI 26754 | Chinese | 694-61-0166 | Higgins Ltd | Teacher | Divorced | Bachelor |
| 999993 | Eastern Hospital | CLAIM_999994 | Alexandra Smith | DX_526 | PROC_1144 | 2024-04-24 | 2024-03-25 | 2024-05-04 | 1273.09 | 3300.31 | General Medicine | 47 | Female | 0 | Cleared | HMO | 4431.86 | 319.08 | Check | Bangkok | lopezronald@example.com | 957-801-6698 | 731 Berg Unions Suite 451, Rossside, ME 80067 | Thai | 815-39-6192 | Murillo-Moore | Doctor | Widowed | High School |
| 999994 | Eastern Hospital | CLAIM_999995 | Ms. Mariah Brown | DX_776 | PROC_6643 | 2024-04-24 | 2024-04-03 | 2024-05-06 | 997.52 | 76.86 | General Medicine | 69 | Female | 1 | Cleared | HMO | 2896.36 | 949.42 | Electronic Funds Transfer | Bangkok | qshea@example.com | (562)500-8789x067 | 2604 Wyatt Junction Suite 541, Patrickville, MI 69528 | Thai | 640-71-5243 | Harper, Wagner and Sampson | Businessman | Married | High School |
| 999995 | Asian Medical Center | CLAIM_999996 | Evelyn Rivers | DX_590 | PROC_1543 | 2024-04-24 | 2024-03-25 | 2024-05-12 | 9576.71 | 3128.20 | Orthopedics | 85 | Male | 0 | Cleared | HMO | 1505.56 | 9891.99 | Check | Seoul | khughes@example.org | 405.871.5546 | 129 Kelly Forges, Longland, AZ 46430 | Thai | 840-84-6763 | Ross-Melendez | Engineer | Widowed | Bachelor |
| 999996 | Moon Healthcare | CLAIM_999997 | Robert Woods | DX_954 | PROC_6870 | 2024-04-24 | 2024-03-26 | 2024-05-09 | 4600.56 | 9604.02 | Orthopedics | 61 | Male | 1 | Cleared | PPO | 4175.44 | 3968.83 | Check | Tokyo | froberts@example.net | 569-771-5484x20259 | 281 Judy Crescent Suite 322, North Edwin, DC 52257 | Japanese | 318-96-8498 | Hansen Group | Engineer | Divorced | High School |
| 999997 | Sun Clinic | CLAIM_999998 | David Thomas | DX_302 | PROC_9405 | 2024-04-24 | 2024-03-25 | 2024-05-20 | 7103.05 | 2955.37 | General Medicine | 41 | Male | 0 | Suspicious | PPO | 1329.91 | 8615.29 | Electronic Funds Transfer | Bangkok | jameswilliams@example.org | (962)996-9863x7621 | 216 Michaela Rapid Suite 198, Williamsmouth, MO 10043 | Chinese | 174-92-1906 | Miller PLC | Teacher | Married | Bachelor |
| 999998 | Asian Medical Center | CLAIM_999999 | Samantha Hubbard | DX_517 | PROC_4916 | 2024-04-24 | 2024-04-05 | 2024-05-21 | 313.71 | 8483.86 | Cardiology | 50 | Female | 0 | Cleared | HMO | 3388.07 | 2163.77 | Electronic Funds Transfer | Beijing | tsmith@example.org | 423-752-9324 | 26176 Joshua Skyway Apt. 043, Riceton, GA 23354 | Thai | 203-65-1589 | Day, Stewart and Frost | Teacher | Single | Master |
| 999999 | Sun Clinic | CLAIM_1000000 | Amy Edwards | DX_699 | PROC_7914 | 2024-04-24 | 2024-03-29 | 2024-05-08 | 1540.96 | 4121.44 | Orthopedics | 51 | Male | 1 | Cleared | HMO | 2752.81 | 3194.72 | Check | Mumbai | barbara46@example.net | 313-781-4871x6894 | 92795 Williams Mills, East Keith, MO 60890 | Indian | 562-34-1333 | Webb-Rogers | Doctor | Divorced | PhD |